/u/bakuretsu is talking about avoiding programming altogether, not about rewriting this in something that compiles natively. This project could be handled by writing extra man pages and using the standard man command to read them.
If you want to do that, you could use Pandoc to convert the markdown tl;dr's already written in your project to groff, which man can display. You could put the pages in their own section, called tldr, say, so that they could be used as
man tldr tar
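A minimal sketch of that conversion, written in Python purely for illustration. It only builds the pandoc command line and does not run it. `-s`, `-t man` and `-o` are standard pandoc options; the file names, the `tldr` section suffix and the helper `man_page_command` are assumptions of mine, and whether `man tldr tar` then finds the page depends on how your manpath is set up.

```python
from pathlib import Path
from typing import List


def man_page_command(markdown_file: str, section: str = "tldr") -> List[str]:
    """Build a pandoc command that turns one Markdown page into a man page."""
    source = Path(markdown_file)
    # Hypothetical naming scheme: tar.md -> tar.tldr
    target = source.with_suffix("." + section)
    # -s asks for a standalone document, -t man selects the groff man writer.
    return ["pandoc", "-s", "-t", "man", str(source), "-o", str(target)]


if __name__ == "__main__":
    print(" ".join(man_page_command("tar.md")))
```

The list can be handed to `subprocess.run` once pandoc is installed.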
I've seen MS Word used, and it freaks me out every time. LaTeX is the most common sane way to do it, and I try to encourage every undergrad I know to get started on it earlier rather than later. AIAA actually has its own LaTeX template that you have to use when writing any papers you intend to submit to any one of their conferences or publications. I think other disciplinary organizations probably have theirs too.
Recently though, ~~Markup~~ Markdown is gaining a lot of traction because there are some document processors out there right now that combine its simplicity with inline LaTeX for mathematics. It allows people to write the actual body of their papers/thesis without getting bogged down in LaTeX's verbose syntax, while simultaneously taking advantage of its strength in parsing equations. YMMV of course, but this is my personal observation at least in my particular subdiscipline of Aerospace Engineering.
I've given it a try myself and I have to admit, it's growing on me. I'm particularly partial to Pandoc.
Edit: Corrections/clarifications.
The awesome Pandoc can convert markdown to groff, so you can easily use the tl;dr's written with this project as man pages instead, and not install the JavaScript code or node.js at all.
You should definitely give pandoc a try. For not too complicated documents it's extremely usable and can convert for example from TeX to .docx. I generally write in Markdown and export for further use into TeX.
Here's a gist on GitHub with a tiny sample document and the make file I use for starting all my documents.
If you’re familiar with markdown and RST give pandoc a try. It is a nifty many-to-many document converter that abstracts away a lot of the clumsy markup in LaTeX, but it’s still fully featured and e.g. extends markdown to allow including floated figures, tables, etc.
You could write your documents using markdown or something else, then convert it to .doc using pandoc.
It also supports a lot of other input and output formats.
Still early days [for me! not the plugins], but i use pandoc with:
NeoBundle 'vim-pandoc/vim-pandoc'
NeoBundle 'vim-pandoc/vim-pandoc-syntax'
NeoBundle 'vim-pandoc/vim-pandoc-after'
Pandoc uses an extended markdown format which goes very well with LaTeX or HTML:
http://johnmacfarlane.net/pandoc/demo/example9/pandocs-markdown.html
If someone wants to see a large application, Pandoc makes extensive use of Parsec to parse many lightweight text markup formats. This is the type of data where a Parsec parser shines, because the syntax is often irregular and context-sensitive.
Use vim + pandoc. You can write using some markup and output PDFs (if you have LaTeX installed), ODTs (the format libreoffice uses), RTFs, HTML...
I'm using that for my Master's thesis (in philosophy). It's great. I recommend using pandoc's bundle for vim and my wrapper for using pandoc + beamer (for presentations).
Also: learning some LaTeX isn't a bad idea (actually, it might help for using pandoc), but markdown is vastly better for writing. (disclaimer: I wrote my bachelor's thesis using LaTeX).
I recommend http://johnmacfarlane.net/pandoc/
I used to do most of my schoolwork in markdown, which pandoc renders into a PDF much as if I had used LaTeX, and I found it less cumbersome to read the plaintext markdown than to open up a PDF or read the LaTeX source.
However, you could use pandoc to read LaTeX and write out in formats that easily translate to Word.
For Grey (or anyone else interested), there is a wonderful little command-line utility called Pandoc which is able to convert to and from many, many markup formats (markdown, HTML, LaTeX, Word, ePub, PDF, etc.) so it will handle just about any text conversion task you throw at it.
Also, for those who want to switch to a more text-based workflow, I highly recommend the site Plaintext Productivity, which describes how to set up your to-do lists, notes/drafts, file system, and backup / version control for plaintext on Windows (though most of it is easily implemented on other platforms as well).
You don't even need to learn LaTeX any more. Just use reStructuredText (similar to markdown with more options) and that can easily be converted to a full LaTeX book with tools such as Pandoc.
Entire books have been written this way.
I don't think it's worth writing another convertor. markdown.pl works fine, and vim is designed to work with command-line utilities.
By the way, check out pandoc; it converts between many formats.
From command line:
> pandoc test.md | tidy --show-body-only true 2> /dev/null
or from vim with a visually selected block:
:'<,'>!pandoc | tidy --show-body-only true 2> /dev/null
>How can you compare Markdown to word processor?
Why not?
You can create headings, numeration lists, bold text and some more.
Latex integration allows you to use even more features (formulas, ...).
That's enough for most needs - including academic papers.
pandoc can convert your text files to pdf (using latex).
You get the easy syntax of Markdown and the powerful features of latex.
Have a look at this:
I've recently stumbled upon a tool called plasTeX. It doesn't convert to ePub directly, rather, it converts to HTML. After that, you can use pandoc to convert the HTML to ePub. I was happy with the result for a simple document, though you might run into problems with your book.
It's important to say that pandoc also converts from LaTeX to ePub, but I wasn't very happy with the results.
However, I agree with you completely, we need a solid LaTeX to ePub converter. Pandoc seems to be progressing rapidly, I hope it evolves into a viable solution soon.
My favorite one is pandoc (http://johnmacfarlane.net/pandoc/). It has a lot of input and output formats. I use it to convert markdown (similar to reStructuredText, but better in my opinion) to LaTeX and HTML.
Using Pandoc or Transmuter, you can turn Markdown into LaTeX.
Lets you write the big bulk of your copy in markdown, and then improve the parts that need to be improved in LaTeX and have a fantastic output.
I used this all through college to great success
> LaTeX
An offering of blood, bone, and a piece of your soul to output beautiful documents. Side note, if you want something that looks good, but not as ball-busting as LaTeX, check pandoc if you have not already. It's written in ... Haskell :D. Using Makefiles I've been able to organize some 'nice' documents. For anything very complex I wouldn't use it though.
Anyway, sorry for the rambling!
But they will accept a PDF. I'm sure you can do this with some kind of org export. For the first half of grad school (anthropology), I used vim and copy-pasted to LibreOffice. Then, I found pandoc. Now I write in Emacs or vim and have a pandoc template that approximates the AAA style closely enough. Recently wrote up a little pandoc-faces.el (unpublished) to highlight pandoc-citeproc-style citations if anyone is interested.
Oh, and to your point about newlines: I'm pretty sure pandoc demarcates paragraphs with one blank line, so they may be wrapped or have a bunch of new lines.
XMonad is a graphic application, written in Haskell, that I'm quite happy to use. Pandoc is my preferred markup conversion toolkit -- and I think there is a large community using it. I used darcs as my preferred decentralized version control for years. It has unique ideas that make it a pleasure to use, some youth issues, but sadly it lost steam after losing the popularity war against Mercurial and Git and grew up increasingly isolated in the ecosystem.
I don't think any of the software I use daily is implemented in Java -- I use LibreOffice sometimes but that's about it. Certainly none of it is implemented in C#. (Javascript, I suspect, cannot be avoided.) I think your assumptions may be overly restrictive of what people do with their computers.
Yeah, I was just going to suggest a markdown plugin. Plus you can always use pandoc to turn it into an acceptable .tex file afterwards.
Dark blue text on a black background (your chapter titles) is generally inadvisable. When I'm reading a story, I tend to prefer dark text on a lighter background, but there are plenty of scripts that will allow people to have the option.
I'm glad that you don't have the nude furry picture at the top of the page. I might hide the picture by default and have a "click here to see NSFW picture" where the picture should go in the text, and have some javascript show the picture.
For typography, please do use hyphens, em-, and en-dashes correctly. If you use a hyphen where an em-dash should be, on first read-through I will often read it as a hyphenated word, and it will take me a moment to realize that there should be a break in the sentence—confusing!
Here's a good reference on the different types of dashes
I prefer to write in pandoc-styled markdown, then use the cross-platform, free, and absolutely amazing pandoc utility to convert to practically any format under the sun, including html and epub. Pandoc even supports custom templates for all of its output file formats!
When using pandoc, I use the -S (uppercase S) or --smart flags (they're the same), which:
> Produce typographically correct output, converting straight quotes to curly quotes, --- to em-dashes, -- to en-dashes, and ... to ellipses.
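To make the quoted behaviour concrete, here is a small Python sketch of the dash and ellipsis substitutions only. It is an illustration of the mapping, not pandoc's implementation, and it leaves out the curly-quote handling, which needs more context than a plain string replace. The function name `smarten` is made up for this example. As far as I know, pandoc 2.0 and later expose this behaviour as the `smart` extension (for example `-f markdown+smart`) instead of the `-S` flag.

```python
def smarten(text: str) -> str:
    """Apply the dash and ellipsis substitutions from the quoted description."""
    # Order matters: handle the three-hyphen form before the two-hyphen form.
    text = text.replace("---", "\u2014")  # em-dash
    text = text.replace("--", "\u2013")   # en-dash
    text = text.replace("...", "\u2026")  # ellipsis
    return text


if __name__ == "__main__":
    print(smarten("pages 10--12 --- roughly..."))
```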
Writing in markdown means writing in plaintext, which means that I am not tied to a proprietary or closed-source application. It's safer even than ODF in that I expect computers will still be able to read the same .txt file 10 years from now, with no loss of fidelity! I can also manage my story through source control such as git, out of which I get versions and (free) backups through github if I want them.
Anyway, that's my advice—thanks for writing!
You might be interested in pandoc. This way you can use basically any markup language you like.
With LaTeX, font size, margins, page numbering and inserting images are all very easy and basically just one line at the beginning, so another option would be to take any markup language you like, export to LaTeX, insert your custom header (write once, use all the time), and export again.
You could try the advice here, which basically amounts to using a crawler to download the site for offline use. If you absolutely need PDF or Mobi, you might see if pandoc can handle the conversion. Similarly, you can cherry pick which pages you want exported (as XML) with Special:Export.
Caveat: I haven't actually tried to do either for the Haskell wiki specifically. I might in a day or two, but as I've just finished driving for 8 hours, I don't feel like doing much of anything at the moment, sorry.
I am fine with TexInfo. It would be nice to see some integration with Pandoc for converting to HTML, print media, plain-text, et cetera. But as an input format I don't have a personal problem with TexInfo. I understand why people would suggest Org, but TexInfo is geared towards technical documentation, while Org is a lot more broad and therefore lacking in some regards.
> Markdown into derived formats (EPUB, PDF, ...) is based on HTML
Not necessarily. Pandoc can convert markdown (and docbook) to tex/latex. Some flavors of markdown contain some extra syntax for things like references that are needed for certain types of writing (e.g. academic papers). Pandoc can also do markdown to asciidoc or docbook, not that markdown can produce everything you might need in those format, but it could potentially be used as a starting point.
Seeing LaTeX as a way to "automate" certain tasks might be the wrong approach. Writing formulae is tedious in any editor, language or application. Nothing beats pen and paper here.
In case you want to mix the worlds of Markdown and LaTeX, I'd suggest you take a look at Pandoc and especially this usage example.
I've heard good things about Pandoc for mkd2pdf, although I've not used it myself. I just keep things in the text format and only convert to HTML when I need to put it on the web.
I definitely second LaTeX for general typesetting, but I'm not sure how easy it would be to export to an ePub file. I'm sure there is some sort of tool that does it, but I don't know one. This might be useful, but I've never tried it: http://johnmacfarlane.net/pandoc/
You don't necessarily need to work with LaTeX for such a thing. Check out pandoc. With it, you can write using a minimal markup in any text editor and then produce your document as an odt or pdf file (or as a presentation, e-book, etc.). It's very versatile.
Pandoc can actually use LaTeX for the rendering, but it's not necessary, as I said.
EDIT: And it was written by a (IMO) quite good philosopher, John MacFarlane.
It's not normal to be stuck on a make for 25 minutes. As for how to proceed, you might want to install the Mac package instead of using Brew. There's a link here:
I would recommend to write documentation in a markup (or markdown) language so that you can easily export your content to multiple formats. Don't people read documentation in all kinds of formats? I would think you would at least want to have documentation in HTML, too.
A local favorite is pandoc markdown which is a great utility.
IMHO technical documents should be stored in reStructuredText (which I prefer to Markdown because of its metadata). You can use Pandoc to generate DOC or PDF files from that.
Those files should be committed either as global docs or docs in a specific folder of the application.
As for the rest, I can't really say because I've never done that...
Hi, I created editR. I'm glad it is useful for other people.
To answer your question, yes, pandoc is a requirement. I guess I should have mentioned it in the installation instructions (will fix this as soon as I can). I use the 'render' function from the rmarkdown package to render the .Rmd documents and it relies on pandoc. The preview is rendered with the 'knit2html' function from the knitr package. It's faster than calling pandoc each time but it has more basic functionalities and less versatility than pandoc when it comes to render the formatted document.
Pandoc is fairly easy to install (http://johnmacfarlane.net/pandoc/installing.html). If you have a Mac I'd recommend installing it via Homebrew or MacPorts. Also installing pandoc-citeproc comes in handy if you want to include bibliographical references.
If you are going to post it on reddit (as a self-post or in this subreddit wiki) you will need to use Markdown formatting. After a quick Google search I found Pandoc which should be able to convert a Word document to Markdown format, which you can then copy/paste to reddit (I have no idea how well this actually works)
I did not know pollen till now and I must say it looks pretty neat. However, you might want to have a look at Pandoc, which can convert from Markdown and many other formats to LaTeX, HTML, ePub etc. I haven't gotten to toy around with it much, yet, but it seems to be able to do the task.
There are a couple of very useful tools which are missing in the article:
> have you tried writing an actual program that transforms markdown (of whichever flavor) to, say, LaTeX? This is ridiculously difficult.
The program you are struggling to imagine is pandoc and it works great. I can get book-quality PDFs from Markdown docs with Pandoc.
Certainly Pandoc is better in general. But for converting TeXInfo? Makeinfo is specifically designed to do that, for Pandoc it's just one of many formats. Even on Pandoc's demos page (http://johnmacfarlane.net/pandoc/demos.html) it shows makeinfo being used to generate the normal TeXInfo outputs.
Perhaps I'm wrong though, I haven't done a side-by-side comparison.
Pandoc has a pretty good Readme
I'll be honest though I don't know Pandoc Extended Markdown, I use Pandoc as some sort of magical universal converter
LaTeX can be converted, with a fair bit of extra effort, to a DOCX with Pandoc. It sounds like, for your purposes, Word is working fine.
To your second point, yes, LaTeX can do all of that but there's a bit more to learn than with Word. This StackOverflow question describes two approaches to doing so: (1) using the package titlesec and (2), using base LaTeX. Titlesec seems pretty nice!
From the chapter title typeface, and that the document generator was dvips and ghostscript, I reckon you wrote this in LaTeX.
You can use pandoc to convert LaTeX to EPUB - or you could use almost any LaTeX to HTML converter, as EPUB is basically HTML with an XML table of contents in a ZIP file.
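The "HTML in a ZIP file" point can be sketched with Python's standard `zipfile` module. This builds only the outer container layout: a `mimetype` entry stored first and uncompressed, plus `META-INF/container.xml`. It is deliberately not a complete or valid book. The package file that the container points at (`content.opf`, a name chosen here for illustration) and the table of contents are left out, and `chapter1.xhtml` and the helper names are likewise assumptions.

```python
import io
import zipfile
from typing import List

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def epub_skeleton(chapter_html: str) -> bytes:
    """Return the bytes of a ZIP laid out like an EPUB container (not a full book)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        # The actual content is ordinary (X)HTML.
        archive.writestr("chapter1.xhtml", chapter_html,
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_entries(epub_bytes: bytes) -> List[str]:
    """List the file names inside the archive, in the order they were written."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()
```

Running `list_entries` on a real `.epub` read from disk shows the same kind of layout, which is why any LaTeX-to-HTML converter gets you most of the way there.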
pandoc comes with a script called markdown2pdf that will use pandoc to convert your markdown files into latex then call pdflatex on the results to produce surprisingly nice PDF files.
I've been looking for something similar to convert LaTeX into HTML, and I just found Pandoc looking for your answer instead. Converts several document formats into several other document formats. With it, I would recommend formatting your manuscript in LaTeX and then trying out Pandoc to make an ePub.
For the language, any backend language would do. Java, Python or Go would work. But there are a bunch of others that would work too.
For the pdf generation, if your backend is Linux, there are a lot of different solutions to generate PDFs. You could try pandoc or text2pdf, for example. There is also a utility called unoconv that uses LibreOffice to convert between different formats. I’m sure there are a dozen other ways to do it too.
What kind of computer programs are they? DOS scripts for automating stuff? These are the only scripts that should cause some effort.
For texts as such you might find pandoc useful.
The script itself is quite simple (just some regex stuff and lots of external calls to pandoc). It should be trivial to rewrite in Rust and include in a build process (or in rustbook).
There is a lot of room for improvement, too. I've not added any custom styling or much epub metadata.
BTW, there's a homebrew formula for gnumeric, which includes <code>ssconvert</code>.
For Word documents, there's <code>pandoc</code>.
>Pandoc… can read markdown and (subsets of) Textile, reStructuredText, HTML, LaTeX, MediaWiki markup, Haddock markup, OPML, Emacs Org-mode, DocBook, txt2tags, EPUB and Word docx; and it can write plain text, markdown, reStructuredText, XHTML, HTML 5, LaTeX (including beamer slide shows), ConTeXt, RTF, OPML, DocBook, OpenDocument, ODT, Word docx, GNU Texinfo, MediaWiki markup, DokuWiki markup, Haddock markup, EPUB (v2 or v3), FictionBook2, Textile, groff man pages, Emacs Org-Mode, AsciiDoc, InDesign ICML, and Slidy, Slideous, DZSlides, reveal.js or S5 HTML slide shows. It can also produce PDF output on systems where LaTeX is installed.
It doesn't even need homebrew, apparently.
It doesn't seem likely that anyone with the skill set to auto generate a CV would apply to anything that requires a .docx CV. That said, I've never used it before but pandoc seems to solve this problem. It can convert latex to .docx. Just "os.system" or "os.popen" pandoc with the relevant files.
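A hedged sketch of that idea, using `subprocess.run` instead of `os.system` so the arguments are passed as a list and a failed conversion raises an error. The file names `cv.tex` and `cv.docx` and both function names are placeholders. pandoc infers the formats from the file extensions, so no `-f`/`-t` flags are given.

```python
import subprocess
from typing import List


def docx_command(tex_file: str, docx_file: str) -> List[str]:
    """Build the pandoc invocation; formats are inferred from the extensions."""
    return ["pandoc", tex_file, "-o", docx_file]


def convert_cv(tex_file: str = "cv.tex", docx_file: str = "cv.docx") -> None:
    """Run pandoc and raise CalledProcessError if it exits with an error."""
    subprocess.run(docx_command(tex_file, docx_file), check=True)
```

Call `convert_cv()` at the end of whatever script generates the LaTeX source; it requires pandoc to be on the PATH.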
That seems like a pretty odd approach. Different editors for the whole process and not just for the final conversion/editing? Most writers really tend to live within them, and a difference in shortcuts and capabilities would drive them insane. What do you mean, I can't do regexp search-replace in MyMobipocketEditor? (Quite a few writers keep DOS machines alive so that they can still work with WordStar or XyWrite)
Given the linux ecosystem, why not get really friendly with one of the many great free editors and just convert for the final format? If your text isn't highly specific (math-heavy papers, screenplays), this isn't too hard. You could use something like Emacs org-mode, or just write in markdown and convert with pandoc.
I would also assume that writers would be interested in setting up distraction-free environments or ways to collect and sort information. (The latter being one reason why Scrivener is so huge in some circles.)
Thanks, handy guide.
I also saw a bunch of XLS files, so those should be extractable.
I've also got a few tools for converting / extracting documents, including <code>pandoc</code> which will convert between many, many document formats, and <code>pdftotext</code> (part of Poppler, a Linux toolkit), which will do a pretty good job of pulling text from documents. There are also some OCR tools, though I've been known to simply resort to re-keying data if necessary.
Pandoc, BTW, is simply fucking amazing:
> Pandoc can convert documents in markdown, reStructuredText, textile, HTML, DocBook, LaTeX, MediaWiki markup, OPML, Emacs Org-Mode, or Haddock markup to
> * HTML formats: XHTML, HTML5, and HTML slide shows using Slidy, reveal.js, Slideous, S5, or DZSlides.
> * Word processor formats: Microsoft Word docx, OpenOffice/LibreOffice ODT, OpenDocument XML
> * Ebooks: EPUB version 2 or 3, FictionBook2
> * Documentation formats: DocBook, GNU TexInfo, Groff man pages, Haddock markup
> * Page layout formats: InDesign ICML
> * Outline formats: OPML
> * TeX formats: LaTeX, ConTeXt, LaTeX Beamer slides
> * PDF via LaTeX
> * Lightweight markup formats: Markdown, reStructuredText, AsciiDoc, MediaWiki markup, Emacs Org-Mode, Textile
> * Custom formats: custom writers can be written in lua.
In particular, you can go from HTML/Markdown to LaTeX, and from there to ePub, which is really slick.
Pandoc is a wizard at this stuff. It supports an enhanced version of Markdown and LaTeX math, you can use your own style sheets, and it will create the ePub and PDF for you.
I'm currently playing with autovala. The underwhelming state of support tools is probably the biggest thing holding me back from playing with Vala in serious projects.
The version of pandoc on my 12.04 based laptop appears to be too old. I installed the latest version (apt-get install cabal-install happy alex), added $HOME/.cabal/bin to $PATH, re-ran cmake, and was able to install.
Also, I really don't understand why AutoVala splits the tool into a binary and a shared library. It doesn't seem like the sort of tool to want to offer an API. The C header file is also generated from Vala source - probably not useful to 3rd party developers.
The app looks really interesting. I’ll be testing it out over the next several days.
I realize that this might be beyond the scope of your app but I would really like to see an app like this incorporate Multimarkdown and Pandoc.
MultiMarkdown
http://en.wikipedia.org/wiki/MultiMarkdown
Pandoc
Something else I might recommend:
Instead of vim-markdown, have a look at <code>vim-pandoc</code>.
Pandoc's markdown format is an extension of Markdown — in other words something equivalent to MultiMarkdown — but I find that the vim-pandoc plugin is on the whole better written and maintained than vim-markdown. And of course, it works for "plain" Markdown files as well.
As an additional bonus, it automatically grabs *.md for itself, meaning you don't have to worry about that.
You can certainly clone it. I run a Gitit instance on my laptop, then push changes from it to my personal wiki that runs on my desktop machine in my office.
You'll have to work out some sort of cleverness if you want to serve individual pages or sets of pages as "subwikis". I mean, the individual pages are just (mostly Markdown) files, but Gitit serves content from the actual git repository, not the checked-out working copy.
Off the top of my head: I guess you could have a branch that represents the "subwiki". The "visible" site that Gitit serves is just the contents of the git repo, wherever HEAD is pointing. Hence, it should be easy to have a branched repo that you can clone, and yet the two clones display different versions of the contents.
What's very nice about Gitit is that it's built on top of Pandoc (by the same author), which means you get all of Pandoc's format conversion "for free". For instance, you can author pages on a single Gitit wiki in multiple formats (and even do things like write Markdown with bits of LaTeX inline). And your users can export pages again in a variety of formats, including PDF.
For instance, I use my Gitit wiki for internal training courses. I write the course notes in LaTeX (and/or Markdown), and the students can simply download them as PDFs to print out.
It may be simpler to convert the rtf into markdown, which is much easier to parse. You can convert the rtf document using pandoc.
From there, you can choose whichever language most suits you to generate your formatted file.
Haskell has a great library called Pandoc for text format conversion; you can read more about it at its website or read the documentation on the Hackage page.
The markdown syntax is slightly different from Reddit's, however. For superscript you need a '^' on both sides of the superscripted text.
"super^script^" -> super<sup>script</sup>
Similarly for subscript, with '~' replacing '^'.
"sub~script~" -> sub<sub>script</sub>
<em> is also used in place of <i> and <strong> in place of <b>.
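-- Written against an older pandoc API; newer releases changed these names.
import Text.Pandoc.Readers.Markdown (readMarkdown)
import Text.Pandoc.Writers.HTML
import Text.Pandoc.Parsing
import Text.Pandoc.Shared

-- Parse Markdown into pandoc's document type, then render it as an HTML string.
markdown2HTML :: String -> String
markdown2HTML = writeHtmlString defaultWriterOptions . readMarkdown defaultParserState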
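For readers without a Haskell toolchain, the superscript and subscript mapping described above can be imitated with two regular expressions in Python. This is only a rough illustration of the syntax, not how pandoc parses it: it handles the simple space-free case and skips doubled markers such as `~~strikeout~~`. The function name `render_scripts` is invented for the example.

```python
import re

# A single marker, some text without spaces or markers, then a closing marker.
SUPERSCRIPT = re.compile(r"(?<!\^)\^([^\s^]+)\^(?!\^)")
SUBSCRIPT = re.compile(r"(?<!~)~([^\s~]+)~(?!~)")


def render_scripts(text: str) -> str:
    """Rewrite ^text^ as <sup>text</sup> and ~text~ as <sub>text</sub>."""
    text = SUPERSCRIPT.sub(r"<sup>\1</sup>", text)
    return SUBSCRIPT.sub(r"<sub>\1</sub>", text)


if __name__ == "__main__":
    print(render_scripts("super^script^ and sub~script~"))
```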
pandoc does a pretty good job of converting markdown to docx files. The only problem I've run into using it is that you have to then open the file in word and change the theme since the default one doesn't look that great.
I think there's a lot of markup formats out there because, in days gone by, you were often stuck with writing in raw html, docbook, LaTeX, Texinfo (for GNU docs), or groff (man pages). There was a backlash, and now you've got (in no particular order) asciidoc, reST, Markdown, Perl's POD, Ruby's doc syntax, moin wiki syntax, and perhaps others.
The solution Pandoc provides is that you can write your docs in just one markup format, and then convert to others as-needed.
Out of all the formats I've used, pandoc-markdown looks and feels the best. Again though, YMMV.
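The "write once, convert as needed" workflow might look like this in a small Python helper. It only builds the command lines, one per target format, and relies on pandoc picking the output format from the file extension. The source name `notes.md`, the list of extensions and the helper name are examples, not anything prescribed by pandoc.

```python
from pathlib import Path
from typing import Dict, List


def conversion_commands(source: str, extensions: List[str]) -> Dict[str, List[str]]:
    """Map each target extension to a pandoc command producing that file."""
    stem = Path(source).stem
    return {
        # -s requests a standalone document (full HTML page, LaTeX preamble, ...).
        ext: ["pandoc", "-s", source, "-o", f"{stem}.{ext}"]
        for ext in extensions
    }


if __name__ == "__main__":
    for command in conversion_commands("notes.md", ["html", "tex", "docx", "epub"]).values():
        print(" ".join(command))
```

Adding `pdf` to the list also works on systems where LaTeX is installed.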
Yeah. Here's the relevant Pandoc docs regarding backslash escapes. Pandoc keeps it simple (as does reST, afaik): backslash escape any punctuation you want taken literally.
Pandoc is one of the nicest things I came across in DTP in recent years. I switched most of my writing to (pandoc's flavor of) markdown, and happily convert to html, odt, tex or texinfo. I also write all my slides with pandoc and beamer. I will probably look more into pandoc scripting. Plus, my emails look better now that I write so much markdown.
You're welcome :-). Thanks for using Pillow! The interesting news here is I finally got around to pandoc-i-fying the old documentation and uploading to RTD. Thanks RTD and Pandoc folks.
I've kind of wondered if it would be straightforward to use/extend Pandoc to write slides in Markdown and then convert to LaTeX, but haven't gotten a chance to look into it.
I think that you don't understand pandoc + markdown:
1) It extends markdown
2) You can insert native latex
3) It "compiles" to latex (or many other formats each of which can include native code iirc)
It just makes a lot of repetitive things simpler IMHO.
It's super useful, a little more intelligent than some other similar programs. It tends to be able to recognize when there is a larger image at the top or bottom of the page and split things accordingly. P.S. You didn't mention what operating system you are using. I've used fbreader to preview epub documents on Linux. For pdf though, any pdf reader will do. For chm, I don't know. I've converted chm to pdf before using chm2pdf, but you could also just "decompile" the chm file using some utility (not sure which) and then use an HTML-to-pdf converter. I use pandoc for this.
I have WriteRoom, which is kinda similar to Ommwriter. And Maruku does markdown-to-TeX, which may be useful, and Pandoc does markdown to ConTeXt.
I've got a Why use TeX? post on my web site also.