There's a lot of excitement among family historians over the National Archives' April 1 release of the 1950 U.S. census. If you're one of those people eagerly anticipating this event, there are two things you need to do over the next two months to prepare.
Legacy's census search tool
Knowing where an individual lived in 1950 is likely going to be critical to your search. So the second thing you need to do is locate an address (it might only be a street name) for each person or family. There are a number of resources, but telephone books will be especially useful. The Library of Congress has digitized a number of telephone directories (Library of Congress U.S. Telephone Book Collection). Although the collection is not complete, you might just be one of the lucky ones. The address is needed to narrow your search to the Enumeration District where the family or individual lived, but even without an exact address, you can use the Steve Morse One Step website to help locate likely E.D.s (Steve Morse One Step/1950 Enumeration Districts).
"My 1950 US Census Release To-do List" from climbingmyfamilytree.Blogspot.com
Why, you're probably asking, do I need this information in order to search the 1950 census? It's because there's a possible problem with the index. Initially, the National Archives said there would be no index available at the time of the census' release. Then in December, we got the good news that there will be an index released on April 1. The potential problem lies with the fact that the indexing is being done not by humans but by a technology called Optical Character Recognition. While humans are fallible, so is technology, and indexing the 1950 census requires creating an OCR program that can "read" all the different ways in which census takers might have formed their letters as they recorded respondents' names.
You might recall the massive indexing effort done by FamilySearch for the release of the 1940 U.S. census. Over four months, 163,000 online volunteers tried to decipher good, bad, and ugly handwriting, but no one did it alone. If there were a problem reading a name, others could be called on to weigh in on the matter. An OCR program doesn't have another OCR program checking its work, so there's a lot of room for error. For example, if you know your Wilson relative was alive and well in Zanesville in 1950, you might be surprised when you can't find them in the index. That's because the OCR read the census taker's handwriting as "Nelson".
So be prepared. Do your homework now so you can get right to work on April 1. Happy hunting.
A typical OCR error rate is 40 errors on a page of 2,000 characters, or 2%. A lot of mistakes will be made with this indexing style. Thank you, Mari, for pointing this out and sharing with us some ideas to combat errors.
Going a little further to highlight how many errors are likely: The population of the US in 1950 was just over 151 million. The average length of a person's name is 12 characters. That works out to 1,812,000,000 characters in the names alone. Using the 2% failure rate I mentioned above, there will be just over 36 million mistaken characters in the names alone.
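For anyone who wants to check that arithmetic or try other figures, here is a minimal sketch. The inputs are the numbers quoted in these comments (151 million people, 12 characters per name, 40 errors per 2,000 characters); the variable names are only illustrative, and the calculation assumes errors are spread evenly across all characters, which real handwriting recognition may not do.

```python
# Figures quoted in the comments above; treat them as rough estimates.
population = 151_000_000      # US population in 1950, rounded down
avg_name_length = 12          # assumed average characters per name
errors_per_page = 40          # quoted OCR errors per page
chars_per_page = 2_000        # quoted characters per page

# Total characters across all names in the census.
total_chars = population * avg_name_length

# Expected misread characters at the quoted rate (40 / 2,000 = 2%).
# Integer arithmetic keeps the result exact.
expected_errors = total_chars * errors_per_page // chars_per_page

print(f"{total_chars:,} characters in names")
print(f"{expected_errors:,} expected misread characters")
```

This counts misread characters, not misindexed names, so the number of affected index entries would differ.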