googledoc_markdown

Why?

At The Marshall Project, stories are edited in Google Docs and stored internally as Markdown. I wrote a quick tool to convert the HTML export from a Google Doc to Markdown, and it turned out that parsing CSS with regexes is not a great idea. This gem is the next iteration.

Here's the strategy:

  1. Use the roadie gem to inline the CSS for font-weight: bold; and font-style: italic;, which the export attaches to classes such as .c01.
  2. Parse the inline styles into a hash of CSS properties with the css_parser gem.
  3. Wrap the <span> with either a <strong> or <em> based on the CSS properties on it. A single <span> may get wrapped multiple times if the text is both bold and italic, for example. Then remove all the <span>s.
  4. Pass this cleaned HTML to kramdown to yield markdown.
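
Steps 2 and 3 above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the gem's implementation: parse_inline_style and wrap_by_style are hypothetical helpers, the hand-rolled parser stands in for the css_parser gem, and the sketch works on a bare string instead of a parsed HTML document. Treating a font-weight of 700 as bold is also an assumption.

```ruby
# Hypothetical helpers for illustration; not part of googledoc_markdown's API.

# Step 2 stand-in: turn an inline style string into a hash of CSS properties.
# (The gem uses css_parser for this; a simple split is enough for a sketch.)
def parse_inline_style(style)
  style.to_s.split(';').each_with_object({}) do |declaration, props|
    name, value = declaration.split(':', 2)
    next if name.nil? || value.nil?
    props[name.strip.downcase] = value.strip.downcase
  end
end

# Step 3 stand-in: wrap the text of a <span> in <strong> and/or <em>
# depending on its CSS properties. Text that is both bold and italic
# gets wrapped twice.
def wrap_by_style(text, props)
  bold = %w[bold 700].include?(props['font-weight']) # 700 is an assumption
  italic = props['font-style'] == 'italic'
  wrapped = text
  wrapped = "<em>#{wrapped}</em>" if italic
  wrapped = "<strong>#{wrapped}</strong>" if bold
  wrapped
end

props = parse_inline_style('font-weight: bold; font-style: italic;')
puts wrap_by_style('Hello', props)
# prints <strong><em>Hello</em></strong>
```

Text with no bold or italic properties comes back unchanged, which matches the idea of dropping the <span> while keeping its contents.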

Installation

Add this line to your application's Gemfile:

gem 'googledoc_markdown', github: 'ivarvong/googledoc_markdown', tag: 'v0.1.1'

And then execute:

$ bundle

Usage

This gem is not stable and probably shouldn't be used yet. The spec might be useful.

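require 'googledoc_markdown'

# your_google_doc_html holds the HTML export of a Google Doc
converter = GoogledocMarkdown::Converter.new(html: your_google_doc_html)
# to_markdown returns the converted Markdown
markdown = converter.to_markdown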

Development

After checking out the repo, run bin/setup to install dependencies. Then, run guard to run the tests.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/ivarvong/googledoc_markdown.

License

The gem is available as open source under the terms of the MIT License.
