What is Reddit's opinion of Portia?
From 3.5 billion Reddit comments

4 reviews of this app found across Reddit:

Really now? We are already on the 4th iteration of Python 3 and a new product gets released exclusively on 2.7? This has to be an April Fools' joke...

Since all the pages you want are in subdirectories, you can download all of them with a spider really easily.

If you are unfamiliar with Scrapy, Portia has a GUI you can use and try out.

They will all be in .html format. You can then find the relevant headers and extract the information you want, or just convert them straight into PDF files with wkhtmltopdf.
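For the "find the relevant headers" step, here is a minimal sketch using only Python's standard library. It assumes the headers you care about are ordinary `h1` to `h3` tags, and the names `HeadingCollector`, `extract_headings`, and `headings_for_directory` are made up for illustration; adjust the tag set to whatever your pages actually use.

```python
from html.parser import HTMLParser
from pathlib import Path


class HeadingCollector(HTMLParser):
    """Collects the text of h1-h3 elements (assumed to be the 'headers')."""

    HEADING_TAGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.headings = []        # list of (tag, text) tuples
        self._current_tag = None  # heading tag we are currently inside
        self._buffer = []         # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current_tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current_tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current_tag:
            # Collapse runs of whitespace left over from nested markup.
            text = " ".join("".join(self._buffer).split())
            self.headings.append((tag, text))
            self._current_tag = None


def extract_headings(html):
    """Return [(tag, text), ...] for every heading in an HTML string."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


def headings_for_directory(directory):
    """Map each downloaded .html file name to its headings."""
    return {
        path.name: extract_headings(
            path.read_text(encoding="utf-8", errors="replace")
        )
        for path in sorted(Path(directory).glob("*.html"))
    }
```

You would point `headings_for_directory` at whatever folder your spider saved the pages into, then look through the result to decide which headings mark the sections you want.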

I used Portia, which was surprisingly easy to set up, and then wrote an R script utilizing str_extract and regular expressions to extract the age (basically by identifying sentences containing "age" and/or "-year" and then extracting the ages from those sentences). I then had to do some manual cleaning and delete sentences such as "The four sibling were living together" from the data set.
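The original was an R script that isn't shown here, so this is only a rough Python sketch of the same idea: keep sentences containing "age" and/or "-year", then pull the numbers out with a regular expression. The pattern and the function names (`split_sentences`, `extract_ages`) are assumptions for illustration, not the script that was actually used.

```python
import re

# Matches "34-year", "34 years", "age 34", "aged 34", "age of 34".
AGE_PATTERN = re.compile(
    r"\b(\d{1,3})[- ]years?\b|\bage[ds]?\s+(?:of\s+)?(\d{1,3})\b",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_ages(text):
    """Return [(sentence, [ages])] for sentences mentioning 'age' or '-year'."""
    results = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        if "age" not in lowered and "-year" not in lowered:
            continue
        # Each match fills exactly one of the two capture groups.
        ages = [int(a or b) for a, b in AGE_PATTERN.findall(sentence)]
        results.append((sentence, ages))
    return results


text = (
    "A 34-year-old man was found. She was aged 7 at the time. "
    "The village is quiet."
)
for sentence, ages in extract_ages(text):
    print(ages, sentence)
```

Note that "The village is quiet." is kept with an empty list of ages, because "village" contains "age". False positives like that are why a manual cleaning pass is still needed afterwards.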

Hi, it's me again :-) I was on my phone at the time and couldn't load your link, and I hoped someone hanging out here would be able to help you.

Now that I've seen the content, I'm sorry to say it's just going to be a grind. They are one of the few websites left in the world that don't use a JavaScript API (which would have made loading the data super, super easy), and they are so old that there aren't any meaningful labels in the page source that would give away the "field" versus the "value," at least not in a way that a computer can easily tell. For example, the "Mailing Address" spanning 2 table cells is the kind of irregularity that drives computers crazy. Then the table underneath the main one switches from horizontal label-value to vertical label-value. That kind of stuff.

But the good news is that there doesn't appear to be very much hidden content, by which I mean data that is only in the page source.
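To make the horizontal versus vertical label-value distinction concrete, here is a minimal sketch using only Python's standard library. It assumes clean tables with no cells spanning multiple columns, which is exactly what the real page does not give you, so treat it as a starting point. All names (`TableRows`, `parse_rows`, `horizontal_pairs`, `vertical_pairs`) are made up for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects table rows as lists of cell text."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def horizontal_pairs(rows):
    """Rows laid out as: label, value, label, value, ..."""
    record = {}
    for row in rows:
        for label, value in zip(row[0::2], row[1::2]):
            record[label.rstrip(":")] = value
    return record


def vertical_pairs(rows):
    """First row holds the labels, second row holds the values."""
    return dict(zip(rows[0], rows[1]))
```

The grind is in deciding, table by table, which of the two helpers applies and in special-casing the rows where a label or value spans more than one cell.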

I do hear you that programming is not your strong suit, but take a look at Scrapely and its friend Portia and see if any of the words make sense. It's hard to judge if those links are interesting, helpful, or just intimidating, because I don't know your background.

Separately, there have been several products, browser extensions, etc. that have claimed to do point-and-click page extraction, but I don't have enough experience with them to recommend one over another.

But, as I mentioned before, feel free to come back and ask more questions, as this stuff really is good fun and is really empowering. It just takes a little getting used to asking the computer in the right way.
