Scan all your books for a dollar each

I just used 1DollarScan in San Jose to scan a bunch of old yearbooks, textbooks, and manuals from General Motors that I have kept for 35 years. There is no “catch.” It really is a dollar a book, but there are some conditions. To be fair, their definition of a book is 100 pages. So if you have a 101-page book, then it’s two dollars. If it’s a 199-page book, it’s still two dollars. A 1001-page book is 11 dollars. It is still the best deal you will ever see. Heck, the Post Office makes more than 1DollarScan if you have to ship the books.
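The per-set arithmetic is easy to check for yourself. Here is a minimal sketch in Python, assuming only the rule described above (one dollar per 100 pages, partial sets rounded up). The function names are my own, and it ignores shipping and any premium add-ons.

```python
PAGES_PER_SET = 100    # from the post: a "book" is up to 100 pages
DOLLARS_PER_SET = 1    # from the post: one dollar per set

def scan_cost(pages: int) -> int:
    """Dollar cost for one book, rounding any partial set up to a full set."""
    if pages < 1:
        raise ValueError("a book needs at least one page")
    sets = (pages + PAGES_PER_SET - 1) // PAGES_PER_SET  # integer ceiling
    return sets * DOLLARS_PER_SET

def batch_cost(page_counts: list[int]) -> int:
    """Total cost for a stack of books, each priced separately."""
    return sum(scan_cost(pages) for pages in page_counts)

for pages in (100, 101, 199, 1001):
    print(pages, "pages ->", scan_cost(pages), "dollars")
```

Each book is rounded up on its own, so ten 101-page books cost more than one 1010-page book would.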

Hiroshi Nakano at 1DollarScan will make high-quality pdf files or jpegs of all your books.

You can use media mail, however, and get a really cheap rate. I like the flat-rate Post Office boxes, and there is always UPS and FedEx ground. If you have a pallet of books, maybe freight is the cheapest way. For folks like me who live in Silicon Valley, you can just drop the books off.

Now, if the books are copyrighted, 1DollarScan intends to dispose of them after scanning, so that there are no copyright issues. You just changed the form of the copyrighted material you already paid for, and the Supreme Court decided that issue decades ago. If the material is your own, or something like a high-school yearbook, 1DollarScan can return the material if you pay for the shipping.

I had two big bags of books for 1DollarScan.

Since they count 100 pages as a “set” and every book is at least one “set,” it did not take long for me to see that I had over 67 sets. I just stopped counting, since they have a 100-dollar-a-month platinum deal, where several premium services are included free.

So I want to point out the downsides, since you are engineers and analytical. But first, rest easy, because Hiroshi Nakano, the founder of 1DollarScan, is also an engineer. He came to Silicon Valley working for a big corporation. After a few years, he returned to Japan. There he noticed similar scanning services growing in popularity, since space is at such a premium in Tokyo. So Nakano returned to Silicon Valley and started 1DollarScan. As you would expect from a fellow engineer, the pricing is rational, the website is clear and it works great, and everything seems too good to be true.

As to those downsides? Well, since he uses very light compression on the pdf files, they are huge. My General Motors Institute yearbook came in at 350MB. That was 242 high-res pages, mostly images. Because I sprang for the 100-dollar-a-month deal, the file was named with the title of the book, and I can use their “tune up” online service to make smaller pdfs suited for phones, tablets, Kindles, or other devices. Alternatively, I believe they will provide you with the raw jpeg files, and for an extra dollar, they can do 600dpi jpegs. For me, the pdfs just make more sense, and hundreds of separate jpeg files are too unwieldy to handle.

The only other downside is that the OCR (optical character recognition) was not perfect. I had 1DollarScan scan in a big 1960s magazine from the Cleveland Plain Dealer called “Cleveland, a city grows to greatness.” The preface has a small biography of the two authors. The type was tiny and the magazine was 50 years old. The page image is perfect, and you can’t see the OCR errors until you highlight the text and cut-and-paste it into a notepad or some other editor. Here is what the OCR produced:

George J. Barmann, coculhor of l{ris work, has been on lhe staff of ihe Plain Dealer since epfember, 1942. He came lo the paper {rom the lllinoir State Journal, in Springfield, where he had gone affergraduation from flre University of lllinoir, in !937. On the Plain Dealer, Barmann spent some time in writing about education. After thal, he did general asignmenf reporting. whish meanr covering almoct the whole range of stories that daily come acro3r the City De*. Barmann, in recenl years, has done chiefly feafurer for the Plain Decler, including a greal many inierview with headline peronaliiies and people of the fheafer. Also, he ha; writlen feature stories aboul lhe Civil War. He lraveled through the Deep South, from New Orleans to Charleston, S.C., and wrole a series of arlicle: on whaf Southerners were thinling in tfii: l00th anniver:ary of ilrat wer. A nalive of Chillicolhe, Ohio, Barmann al*ended Miami Univer:ity ai Orford, O. before lramfering to iournalirm ai lhe Universify of lllinois.

Other fonts came out much better; this was the worst OCR of anything that got scanned. But there is a solution to both the big file size and the OCR accuracy. Based on the advice of analog engineer Walt Jung, I had purchased a copy of ABBYY Finereader 11. I am pretty sure it was under 100 dollars. I used ABBYY to scan in all my loose papers and tax records. I find it far better than TextBridge and other OCR programs, which I also own. ABBYY will take in a pdf file, re-recognize the text, and save it with much higher compression. So I ran the 1DollarScan pdf through ABBYY and made another pdf file. That file of a 64-page ledger-size book was 9MB instead of 120MB. Here is the ABBYY OCR result of the Cleveland book:

G e o r g e J. Barmann, co­author of this work, has been on the staff of the Plain Dealer since September, 1942. H e came to the paper from the Illinois State Journal, in Springfield, where he had gone after gradua­tion from the University of Illinois, in 1937. O n the Plain Dealer, Bar­mann spent some time in writing about education. A fte r that, he did general assignment reporting, which means covering almost the whole range of stories that d a ily come across the C it y Desk. Barmann, in recent years, has done chiefly features for the Plain Dealer, including a great many interviews with headline personalities and people of the theater. Also, he has written feature stories about the C iv il W a r . H e traveled through the Deep South, from New Orlea ns to Charleston, S.C., and wrote a series of articles on what Southerners were thinking in this 100th anniversary of that war. A native of Chillicothe, Ohio, Barmann attended M ia m i University at Oxford, O., before transferring to journalism at the University of Illinois.

You can see ABBYY was much more accurate, but its problem is that it peppers extraneous spaces through the text. If you searched Google for George Barmann, it would find the 1DollarScan pdf but not the ABBYY pdf. This is because ABBYY is trying to line up the hidden OCR text, which is what gets highlighted, with the image of the text on top of it. Since the font is a bit funky, is hand-typeset, and has kerning, ABBYY breaks up words when it adds needless spaces. Both OCR results were a bunch of separate lines that I concatenated above so they would fit this post. Where the ABBYY version has hyphens, those are correct; there were line breaks there. Oh, I know, I can take a screenshot of the pdf images. Here:

1DollarScan-OCR-sample

The 120MB 1DollarScan screenshot has way less image compression. If you click on the image, you can see the author’s eyes much more clearly than in the image below.

ABBYY-OCR-sample

The 1DollarScan 120MB pdf run through and re-recognized by ABBYY Finereader 11 is only 9MB, and the text quality is nearly as good. The ABBYY image quality suffers from the higher compression, so you should not erase the original 1DollarScan files.

You can see that the 9MB ABBYY pdf is almost as good for text as the 120MB 1DollarScan pdf, but the image in the 1DollarScan pdf is clearly better. So for things like a yearbook, I definitely will keep the larger 1DollarScan file, and maybe make an ABBYY pdf out of that to send around or post online. I looked into the extraneous spaces in ABBYY and there seems to be no “cure.” I tried making a “tagged” pdf in ABBYY, and the result was just much bigger with even worse OCR.
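There is no cure inside ABBYY that I know of, but if you search the extracted text yourself, you can work around the stray spaces. Here is a rough sketch in Python, assuming you have already pulled the text out of the pdf into a string. The names `loose_pattern` and `loose_find` are mine, not anything from ABBYY, and treating the soft hyphen as ignorable is my own assumption about what the OCR leaves behind. The idea is to build a search pattern that tolerates whitespace between any two letters of the query.

```python
import re

# Characters the OCR may have sprinkled inside a word:
# any whitespace, plus the soft hyphen (U+00AD). This set is an assumption.
GAP = r"[\s\u00ad]*"

def loose_pattern(query: str) -> re.Pattern:
    """Regex that matches `query` even with gaps between its letters."""
    letters = [re.escape(ch) for ch in query if not ch.isspace()]
    if not letters:
        raise ValueError("query must contain at least one non-space character")
    return re.compile(GAP.join(letters), re.IGNORECASE)

def loose_find(query: str, text: str) -> bool:
    """True if `query` occurs in `text`, ignoring stray spaces and soft hyphens."""
    return loose_pattern(query).search(text) is not None

# Two fragments of the ABBYY output quoted earlier in the post.
sample = ("G e o r g e J. Barmann, co\u00adauthor of this work ... "
          "feature stories about the C iv il W a r .")

print("George" in sample)               # plain substring search
print(loose_find("George", sample))     # gap-tolerant search
print(loose_find("Civil War", sample))  # also works across the word break
```

The trade-off is false positives: because gaps are allowed everywhere, a query like “heart” would also match “he art.” That is fine for finding a name in a yearbook, less fine for anything precise. It also does nothing for Google, which only sees the text as ABBYY wrote it.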

OK, so you can see that 1DollarScan is the real deal. Here is a photo montage.

Hiroshi Nakano examines the books dropped off for scanning at 1DollarScan.

One side of the warehouse at 1DollarScan is for books waiting to be scanned. Lead times are only a week or two.

Hiroshi Nakano uses this guillotine stack paper cutter to remove the bindings of your books.

Hiroshi Nakano shows the spine of a book that he has cut off with the guillotine stack paper cutter at 1DollarScan.

Here is the workstation where an employee at 1DollarScan feeds several scanners at once, while tending to paper jams and ensuring you get the perfect scan.

I should mention that I asked Hiroshi Nakano whether I should leave out the picture above, which shows the heart of his operation with the multiple scanners being fed by his employee. I told him that someone might see it and try to compete with him. Nakano smiled and said, “Nobody can compete with me.” I love the precision and factual nature of my fellow engineers, don’t you? Let’s face it, a dollar to scan 100 pages with OCR is pretty remarkable.

Once the books are scanned and the pdfs are posted for your download, 1DollarScan holds your books for two weeks, in case there were any problems. After that, the copyrighted books are recycled, or un-copyrighted materials are sent back to you if you pay shipping.

Hiroshi Nakano from 1DollarScan patiently explained his operation to me and I was assured that it really is true that he can do high-quality scans of your books and magazines for a very low price.

There is real joy in being able to keep all my books in electronic form while dispensing with hundreds, maybe 1000, pounds of paper. Let’s see if I can find a picture:

Paul-Rako-scanning

An engineer can collect a lot of paper. There were tax records for my business, project folders for jobs I worked on, letters from college girlfriends, owner’s manuals, and two big stacks of books you can barely see in the back right corner. The ammo boxes at bottom right are full of pictures and negatives.

It took about six months just to scan in all these loose papers. The hand-written letters from girlfriends I kept as 300dpi jpegs. Same for my hand-written printed notes; the OCR in ABBYY is pointless on handwriting anyway. For pictures, I scanned at 600dpi; with anything finer I could not see any difference on the 47-inch TV I use as a monitor. For negatives and slides I did 2400dpi, which is the same spatial resolution as doing the printed picture at 600dpi. I used ABBYY to make pdf files of any printed materials, including some booklets that I thought 1DollarScan might not want to do. But all the books, yearbooks, magazines, and manuals, well, it was just so nice to send those two big stacks to 1DollarScan and have it taken care of by some diligent professionals. Let’s face it, disk space is nearly free. I have a 2-Terabyte NAS (network-attached storage) at home that can hold all these files with room to spare.
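The 2400dpi-versus-600dpi equivalence works out when the print is about a 4x enlargement of the film. Here is a quick back-of-the-envelope check in Python. The frame and print sizes are assumptions for illustration, not anything measured: a standard 35mm frame of 36mm by 24mm, and a 6-inch by 4-inch print.

```python
MM_PER_INCH = 25.4

def pixels(length_inches: float, dpi: int) -> float:
    """Pixels produced by scanning a given length at a given resolution."""
    return length_inches * dpi

# Assumption: standard 35mm film frame, 36mm x 24mm, scanned at 2400dpi.
neg_w = pixels(36 / MM_PER_INCH, 2400)
neg_h = pixels(24 / MM_PER_INCH, 2400)

# Assumption: a 6in x 4in print of that frame, scanned at 600dpi.
print_w = pixels(6, 600)
print_h = pixels(4, 600)

print(f"negative at 2400dpi: {neg_w:.0f} x {neg_h:.0f} pixels")
print(f"print at 600dpi:     {print_w:.0f} x {print_h:.0f} pixels")
print(f"dpi ratio: {2400 / 600:.0f}x")
```

The two pixel counts land in the same ballpark, which is the point: the film is roughly a quarter the linear size of the print, so it needs roughly four times the dpi to capture the same detail. A bigger print or a different film format would change the ratio.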

Paul-Rako_scanning-desk

For scanning all my loose documents, I had a Canon laser MF 4890dw on the right. For 11×17 and oversize, a Brother MFC-J6710DW inkjet on the left. For pictures, slides and negatives a Canon CanoScan 8800F back-lit flatbed in the middle.

A standard desk would hold all three of my scanners. The laptop was driving my two TVs, and a wireless keyboard and mouse did the control. Lots of paper towels and Windex to keep the platens clean. It was a monumental job but now it is done. I will keep all my receipts and records on the NAS now. I back it up onto the laptops, and to an SSD (solid-state drive) I keep in the safety deposit box at the bank. I have auto titles and my birth certificate as paper; everything else is virtual. It is heaven. I gave away the two printer-scanners but kept the flatbed so I can do receipts and such as they come in. No more shoe boxes full of receipts for me.

Paul-Rako_scanned-documents

I made three runs to the Sunnyvale dump with the scanned paper. My pal said some companies will let you put your personal stuff in their shredder boxes. Either way, it is great to have all kinds of room, as long as I resist the temptation to fill it up with old test equipment or Sportster parts. And for getting rid of that stuff, you can use flea markets, Craigslist, and eBay.

Stock-engine-test_1969-Chevrolet-327-cu.in.-V-8-42p_ABBYY

Here is a link to a scan I had done by 1DollarScan and then ran through ABBYY Finereader. It went from 28MB to 2.7MB. The booklet is a really cool engine test report I had from my student days at GMI. Since Mary Barra, the CEO of GM, went to GMI too, hopefully she won’t sic a bunch of high-tone Detroit lawyers on me. I do note there is no copyright symbol on the document. Now, the blocky shading on the title is due to the compression in ABBYY. The 1DollarScan document does not have those artifacts. But it is 10 times bigger. Note how the scan is straight, and note how they scanned both sides of the back cover, even though there was no text. You want the whole booklet scanned, they do it. Now, they do charge 2 dollars for magazines, and I am not sure whether they counted this booklet as a magazine or a thin book. In any event, that was why it was worth it to get the premium membership for a month. Hiroshi Nakano thought I was being fair and I thought the same about him. More than fair. Part of the premium service is that they give the pdf file the name of the book. In this case they carefully typed: Stock engine test, 1969 chevrolet 327 cu.in.v-8, 42p.pdf. Since I am putting it on a web server, I changed the name to my convention: Stock-engine-test_1969-Chevrolet-327-cu.in.-V-8-42p_ABBYY.pdf. When pals talk trash about GM, I whip out this 45-year-old engine test. General Motors knows more about cars and good engineering than all the congressmen and lawyers put together. I still miss being an auto engineer.
