
Re: Good html to text rendering



On 2018-08-13 14:33, Mike Gran wrote:
On Sun, Aug 12, 2018 at 08:41:47PM +0100, Nuno Silva wrote:
On 2018-08-12 18:44 +0000, Mike Gran wrote:
> I was looking for a good html to text renderer to put in, but, the ones I've tried
> (like html2text in Python) are kind of boring and do a bad job with tables.
>
> Right now, the best one I've found is piping the text through lynx.
>
> What is the greatest html to text rendering chain?

Do you have some example whose link you can share? Something with tables
and other items you want to come out right.

I don't really have anything to show yet.  I had made a couple of
simple test entries with the HTML I wanted to support, and then tried
Python's html2text on them, which made valid Markdown but wasn't the
best visually.

I probably will just go with piping html through lynx for now. My only
argument against that, which was pretty weak, was that I didn't want
the hassle of putting the lynx executable in my Docker container.
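
For reference, the lynx pipe described above could look roughly like this
from Python. This is only a sketch: the function names are made up for
illustration, and it assumes a lynx binary on PATH (which is exactly the
Docker hassle mentioned above). The flags are standard lynx options.

```python
import shutil
import subprocess


def lynx_command(width=80):
    """Build the lynx argument list for rendering HTML read from stdin."""
    return [
        "lynx",
        "-stdin",          # read the document from standard input
        "-force_html",     # treat the input as HTML, since there is no file name
        "-dump",           # write the rendered text to standard output and exit
        "-nolist",         # leave out the numbered link list at the end
        f"-width={width}", # column width used when formatting the dump
    ]


def html_to_text(html, width=80):
    """Pipe an HTML string through lynx and return the rendered text."""
    if shutil.which("lynx") is None:
        raise RuntimeError("lynx is not installed or not on PATH")
    result = subprocess.run(
        lynx_command(width),
        input=html,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling html_to_text("<table><tr><td>a</td><td>b</td></tr></table>") would
then return whatever lynx renders for that table.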

I was looking for a solution to bring content from random websites to gopher. I ended up using readability. It extracts the content part of many websites.

https://github.com/cantino/ruby-readability

I brought hacker news to gopher here:
gopher://codevoid.de/1/hn

On some entries, you see a selector "text version". These are created via
readability.

I'm not sure if this helps you, but I thought I'd throw it into the mix. It's probably overkill. But it would allow you to create a gopher menu that
uses the same text as the HTML version.
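
To illustrate the gopher menu idea, here is a rough sketch of turning
extracted text into a menu. It is not the code behind the hn mirror. The
function names, the 67 column wrap width and the example host are
assumptions; the tab-separated line layout follows RFC 1436, and the "i"
info item type is a common extension rather than part of that RFC.

```python
import textwrap


def menu_line(item_type, display, selector="", host="error.host", port=1):
    """Format one menu line: type char, display text, selector, host, port."""
    return f"{item_type}{display}\t{selector}\t{host}\t{port}\r\n"


def text_to_menu(text, link_selector, host, port=70, width=67):
    """Wrap plain text into info lines and append a link to the text file."""
    lines = []
    for paragraph in text.split("\n\n"):
        # textwrap also expands tabs, so the text cannot break the field layout
        for wrapped in textwrap.wrap(paragraph, width=width):
            lines.append(menu_line("i", wrapped))
        lines.append(menu_line("i", ""))  # blank info line between paragraphs
    lines.append(menu_line("0", "text version", link_selector, host, port))
    # A line holding a single period ends the menu
    return "".join(lines) + ".\r\n"
```

For example, text_to_menu(article_text, "/hn/article.txt", "example.org")
gives a menu that shows the article inline and links to the plain text file;
both the selector and the host here are placeholders.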

Personally, I would suggest doing it the other way around: create gopher content and proxy it to HTML. My gopher hole is also accessible via https://codevoid.de

Works great, looks great (I think) and it saves me from all the conversion hassle.
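
The gopher to HTML direction can be sketched in a few lines. This is only an
illustration of the idea, not how codevoid.de is actually served: the
function name is hypothetical, and for simplicity the links point at
gopher:// URLs, where a real proxy would rewrite them to its own HTTP paths.

```python
import html


def menu_to_html(menu):
    """Render a gopher menu as a preformatted HTML fragment."""
    out = ["<pre>"]
    for raw in menu.splitlines():
        if raw == ".":  # a lone period marks the end of the menu
            break
        if not raw:
            continue
        item_type, fields = raw[0], raw[1:].split("\t")
        display = fields[0]
        if item_type == "i" or len(fields) < 4:
            # Info lines (and malformed lines) become plain text
            out.append(html.escape(display))
        else:
            selector, host, port = fields[1:4]
            url = f"gopher://{host}:{port}/{item_type}{selector}"
            out.append(
                f'<a href="{html.escape(url, quote=True)}">'
                f"{html.escape(display)}</a>"
            )
    out.append("</pre>")
    return "\n".join(out)
```

Since the menu is already plain text, the only conversion work is escaping
and link generation, which is why this direction avoids most of the
rendering problems discussed in this thread.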

Best Regards,
Stefan

