Huh, I've tried similar solutions in the past to free myself and my personal office from paper, and I thought I had already tried open-paperless, but then I noticed that your first commits were about half an hour ago.
Then I noticed: there are similar projects with really similar names: paperless (selfhosted) and open-paperwork (desktop).
I'm gonna give open-paperless a try, although I'm trying to build a personal document management system app that integrates with Nextcloud.
Thank you!
After hunting for the same solution for the same reasons, I ended up going with paperless. A proper EDMS, even Mayan, is way too much for my needs. I was able to set up paperless and start scanning items in within about two hours. Paperless does everything I need, namely an easily searchable archive of scanned items. It lets you download (to email) a PDF afterwards. It has a web front end for search and tag/document management. This is all I needed, and so far I am happy using paperless.
I say start off using paperless, and if you find you need more, then look for additional software. I'm sure you'll be well served as a home user with paperless: https://github.com/danielquinn/paperless/blob/master/README.md
You could try Paperless or OpenPaperwork and just tag the receipts with a Receipt tag.
https://github.com/danielquinn/paperless
https://github.com/openpaperwork/paperwork
Both use Tesseract for OCR, which is a bit of a mixed bag.
Maybe Paperless? I haven't used it myself yet, but apparently a lot of people do.
It can handle encryption with GnuPG, and it is ideal for hosting on a RasPi. I don't know the structure or the mobile layout, but you can take a closer look at that yourself.
I have a lot of handy scripts in Python, but the biggest thing I've done is probably Paperless. It's a means of keeping track of all of my paper documents by OCRing and indexing them so I can search for them later.
I think that would be a great idea!
Maybe in connection with paperless? I thought about setting it up because it seems to work pretty well.
You should have a look at my Paperless project. It operates on a loop, consuming PDFs and OCRing the image into text.
Once it's in text form, I suppose you could write something to determine the file name, or you could just apply the correspondent & tag rules inside the Paperless UI. Whatever you do, it might be a good place to start.
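To make the "consume, then apply rules" idea concrete, here is a minimal sketch of one pass over a consumption directory. It is not how Paperless itself is implemented: `TAG_RULES`, `match_tags`, and `consume_once` are hypothetical names, the keyword matching is a deliberately naive stand-in for the correspondent and tag rules in the UI, and the OCR step is passed in as a function so you can plug in whatever you use.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical rules (an assumption, not Paperless's own format):
# keyword found in the OCR text -> tag to apply.
TAG_RULES: Dict[str, str] = {
    "invoice": "Invoice",
    "receipt": "Receipt",
}


def match_tags(text: str, rules: Dict[str, str] = TAG_RULES) -> List[str]:
    """Return the sorted tags whose keyword appears in the text (case-insensitive)."""
    lowered = text.lower()
    return sorted({tag for keyword, tag in rules.items() if keyword in lowered})


def consume_once(directory: Path, ocr: Callable[[Path], str]) -> Dict[str, List[str]]:
    """Run one pass over the directory: OCR every PDF and match tags on its text."""
    results: Dict[str, List[str]] = {}
    for pdf in sorted(directory.glob("*.pdf")):
        results[pdf.name] = match_tags(ocr(pdf))
    return results
```

A real consumer would call something like `consume_once` repeatedly with a sleep in between, and would move or delete each file once it has been indexed so it is not picked up twice.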
For the more technically minded I've started using this - https://github.com/danielquinn/paperless
I have it set up on my Raspberry Pi. It OCRs and encrypts all the documents. It allows for searching, tagging, and sorting by correspondents, and gives a nice interface for downloading whichever file you need without having to worry about searching through files and folders. I'm really enjoying using it.
An "API" is on the way in the sense that I've added a simple HTTP POST URL (still in the testing branch). You can send a POST with the sender, title, document (file), and a signature generated with a shared secret (to prevent abuse) and the server will dump the file into the consumption directory to be consumed as part of the standard indexing process.
If you'd like to try that out, feel free to take a look at the images-as-docs branch and give it a shot. I hope to have this stuff merged into master this week.
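For anyone wondering what a shared-secret signature could look like on the client side, here is a rough sketch. The post above does not say how the signature is actually computed, so treat everything here as an assumption: it uses HMAC-SHA256 from the Python standard library, and the `sign` and `verify` helpers are hypothetical names, not part of Paperless. Check the branch for the real scheme before relying on this.

```python
import hashlib
import hmac


def sign(sender: str, title: str, document: bytes, secret: bytes) -> str:
    """Return a hex HMAC-SHA256 over the three fields (assumed scheme).

    Each part is length-prefixed so that the fields cannot run together
    and produce the same signature for different inputs.
    """
    mac = hmac.new(secret, digestmod=hashlib.sha256)
    for part in (sender.encode("utf-8"), title.encode("utf-8"), document):
        mac.update(len(part).to_bytes(8, "big"))
        mac.update(part)
    return mac.hexdigest()


def verify(sender: str, title: str, document: bytes, secret: bytes, signature: str) -> bool:
    """Recompute the signature on the server side and compare in constant time."""
    expected = sign(sender, title, document, secret)
    return hmac.compare_digest(expected, signature)
```

The client would send the sender, title, file, and the output of `sign` in the POST; the server, holding the same secret, would run `verify` and only drop the file into the consumption directory if it returns `True`.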
I write web software & maintain a few Free projects like Aletheia and Paperless:
I have a few clients & projects, so there's some overlap here as some systems were in place before I arrived. Generally though, I'm a huge fan of GitLab and prefer AWS to Heroku.
I use and like https://github.com/danielquinn/paperless
I scan documents from my scanner straight to my NAS share, paperless detects and imports them.
It has a nice open source macOS desktop client https://github.com/thomasbrueggemann/paperless-desktop
Came here searching for the same thing :)
So far, this is the best self-hosted solution I've found (but I would be grateful for other suggestions before I start installing): https://github.com/danielquinn/paperless It includes OCR, indexing, and search out of the box.
There is also MayanEDMS, but it is a full-fledged DMS, which is a bit of overkill for home use.
Oh wow, I just realized this is the most bored and boring my homelab has ever been. Used to have a full stack of Cisco gear for testing but VIRL let me remove that, and then I didn't renew VIRL...nuts.
Current
Physical
Virtual
Plans
So that should keep me busy through December!
For a long time I've just kept PDFs stored on my NAS, but I'm considering switching to paperless since it would be much more scalable in the long run, including things like search. Next rainy day I have, I'm likely going to take the plunge.
Looks very cool.
> Check out this repo to somewhere convenient and install the requirements listed here into your environment.
Any particular reason there's no requirements.txt file?
Edit: Found it.