Channel: uncategorized - Forum - FlexGet

Parse a complete HTML file


@Tarrasque wrote:

Hi all.

I need to get entries from a website which unfortunately is VERY VERY badly made.

URLs and titles are nowhere near each other in the markup, so none of the usual plugins can pair them up. Plus, the HTML is malformed in various places (unclosed tags and such), so even an XML parser is nearly useless.

The only way I have found so far to extract any meaningful info from those pages is a Python script that uses BeautifulSoup and CSS class queries. With it I am able to reconstruct title / URL pairs.
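
For reference, here is a stripped-down sketch of the kind of script I mean. The class names (`item`, `item-title`, `item-link`) and the function name `extract_entries` are placeholders made up for illustration, not the real site's markup, and it assumes the `beautifulsoup4` package is installed.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_entries(html, base_url):
    """Return a list of (title, url) pairs found in a badly formed page.

    The CSS classes used below are hypothetical placeholders.
    """
    # Python's built-in "html.parser" backend is lenient, so unclosed
    # tags do not stop the document from being parsed.
    soup = BeautifulSoup(html, "html.parser")

    entries = []
    for block in soup.select("div.item"):
        title_tag = block.select_one(".item-title")
        link_tag = block.select_one("a.item-link[href]")
        if title_tag is None or link_tag is None:
            # Skip blocks that lack either half of the pair.
            continue
        title = title_tag.get_text(" ", strip=True)
        # Resolve relative links against the address of the page.
        url = urljoin(base_url, link_tag["href"])
        entries.append((title, url))
    return entries
```

Calling `extract_entries(page_source, page_url)` gives back a plain list of pairs, which is roughly the output I would like to hand over to the rest of the task.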

So, what I'm asking is: is there a plugin that lets me feed a whole page into a custom script and pass the script's output to the rest of the FlexGet task for processing?

Posts: 1

Participants: 1
