Update 2020-04-27
Since submissions have tapered off, we've taken down the server. Thanks to all who have contributed so far! We'll let everyone know when more information is available.
Update 2020-04-26
Thanks to all of those who have contributed! We've been running for about a week, and results leveled off around 80% complete and are currently around 84% complete. We now have some statistics and a few proposals.
Some basic statistics:
- Pages: 297
- Lines: 19510
- Page submissions: 744
- Line changes (roughly): 10297
- Changes where 2/3 of submissions agree: 9191
- Changes where 2/3 can't agree: 1106
A few more advanced heuristics were also tried (ex: partial line matching, weighting user results based on how much we trust their results), but ultimately we weren't convinced any of these are the right approach.
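For the curious, the basic agreement check can be sketched roughly as below. This is only an illustration, not the code running on the server: it assumes "2/3 agree" means at least two of three submissions match exactly, and that all submissions for a page have the same number of lines. The names `vote_line` and `merge_page` are made up for this example.

```python
from collections import Counter


def vote_line(candidates, needed=2):
    """Return the text that at least `needed` submissions agree on, else None."""
    # Ignore trailing whitespace so trivial differences don't block agreement.
    counts = Counter(c.rstrip() for c in candidates)
    text, votes = counts.most_common(1)[0]
    return text if votes >= needed else None


def merge_page(submissions):
    """Merge several transcriptions of one page, line by line.

    `submissions` is a list of transcriptions, each a list of lines.
    Assumption: every submission has the same number of lines. Real
    submissions may add or drop lines, which this sketch does not handle.
    """
    return [vote_line(candidates) for candidates in zip(*submissions)]


# Made-up sample data: three transcriptions of the same three lines.
page = [
    ["LDA #$00", "STA $10", "JMP LOOP"],
    ["LDA #$00", "STA $1O", "JMP L0OP"],
    ["LDA #$00", "STA $10", "JMP L00P"],
]
merged = merge_page(page)
print(merged)  # ['LDA #$00', 'STA $10', None]
```

The `None` entry marks a line where no two submissions match, which is the "can't agree" case counted above.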
So, where does this leave things? Two main options are being considered:
- Push the annotated source to GitHub or GitLab as is. We estimate that it would take someone about 6-12 hours to fix, which is not intractable. The default would have been the furby-source repository on GitHub, but they have stopped responding.
- Restart the crowdsource server using the best result with annotated conflicts. Users would need to delete the extra lines and submit. However, we suspect users need a break, so at a minimum we would probably hold off a few months to regain momentum.
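To give a feel for what "annotated conflicts" could look like in either option, here is a rough sketch. It is an assumption about the format, not a description of the actual output: agreed lines are emitted once, and disputed lines are emitted once per distinct candidate with a marker so a reviewer can delete the wrong ones. The `<<CONFLICT>> ` marker and the function name `annotate_line` are invented for this example.

```python
from collections import Counter


def annotate_line(candidates, needed=2, marker="<<CONFLICT>> "):
    """Return the output lines for one source line.

    If at least `needed` candidates match, return that single line.
    Otherwise return every distinct candidate, each prefixed with
    `marker` (a hypothetical convention), in first-seen order.
    """
    counts = Counter(candidates)
    text, votes = counts.most_common(1)[0]
    if votes >= needed:
        return [text]
    # dict.fromkeys drops duplicates while keeping the original order.
    return [marker + c for c in dict.fromkeys(candidates)]


# Made-up sample data.
print(annotate_line(["STA $10", "STA $1O", "STA $10"]))
# ['STA $10']
print(annotate_line(["JMP LOOP", "JMP L0OP", "JMP L00P"]))
# ['<<CONFLICT>> JMP LOOP', '<<CONFLICT>> JMP L0OP', '<<CONFLICT>> JMP L00P']
```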
Note we suspect additional fixes will be required upon eventual manual review, whichever path is taken. Generally the first option seems like the best. A few dedicated users could knock this out fairly quickly without too much coordination. If we get a few volunteers (or one very dedicated volunteer), we'll figure out where to push this and move the project forward. Ideally one of these people would also be interested in coordinating other community contributions.
So we're asking if people are interested in the first option, and we'll likely default to the second if we don't get traction. Please let us know here in the comments or on Twitter!
Update 2020-04-20
Higher quality .pngs have been swapped in after reports that compression is swapping letters (!). Special thanks to Video Game Preservation Collective for the above image! The old set was from the text annotated version while the new set is believed to be the original scan. Unfortunately these images are about 5x larger, but should improve accuracy.
We've also done a very crude analysis of the existing submissions and used them to make a quick guess at better default text to present. This affects about 85% of entries. So going forward you'll typically get higher quality defaults. But please still be attentive and look for errors!
There have also been a few backend tweaks, notably favoring showing pages with fewer submissions. However these generally should not be visible externally.
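As a rough idea of what "favoring pages with fewer submissions" can mean, here is a minimal sketch. It is not the server's actual logic: it assumes the simplest possible policy (always pick among the pages with the lowest count, at random), and `pick_page` and the page ids are hypothetical.

```python
import random


def pick_page(submission_counts, rng=random):
    """Pick a page id, favoring pages with the fewest submissions.

    `submission_counts` maps page id -> number of submissions so far.
    Assumed policy: choose randomly among the pages tied for the
    lowest count.
    """
    fewest = min(submission_counts.values())
    candidates = [page for page, n in submission_counts.items() if n == fewest]
    return rng.choice(candidates)


# Made-up counts: page "p2" has the fewest submissions, so it is chosen.
print(pick_page({"p1": 3, "p2": 1, "p3": 2}))  # p2
```

A softer policy (for example, weighting pages by how far they are from the target of 3 submissions) would spread work more evenly, but the idea is the same.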
Update 2020-04-19
We're up to 197 submissions! Thanks to all of you that have posted so far! We need a minimum of 297 (one per page), so we're making great progress. Our goal is to get 3 submissions per page to help correct errors, for a total of 297 x 3 = 891.
We will briefly bring down the site for maintenance at 2020-04-21 6:00 AM. We will use this window to improve the default text based on submissions so far. This should make challenges much easier as mostly you'll only need to do small corrections instead of large edits. We will also fix the overall progress indicator, which currently says 1485 required, but it should be 891.
Once again, thanks for your help and please let us know if you have any feedback!
Micro update: the progress indicator fix has been pushed out (it was not necessary to bring the server down)
Background
The Furby is an iconic talking toy from the late 90s. A couple of years ago scans of the original Furby source code were acquired. Unfortunately the scans are noisy and automatic image to text conversion is difficult. So we're asking the community to help preserve game history by proofreading computer generated transcripts. Generating a proper copy of the Furby source code will be enormously valuable to understanding how it works!
Project TLDR:
- Complete using your web browser
- You need a large screen (laptop or desktop)
- Scanned image at left, noisy text interpretation at right
- Fix errors in the image to text translation and submit
- Remove headers and footers (ex: "Page 6", "A-121", "Diag7.asm" )
- Unreadable: put best guess if possible, or random characters as last resort (will flag for review)
Although the crowdsourcing system wasn't a good fit for Great Swordsman, it spurred some conversations on what it could be used for. It has been revived and adapted to work on improving PDF image to text conversion.
Join the effort by signing up for an account! If you had an account on the previous TGP project, it likely is still available. Additional instructions are available after creating an account. If you have some time, please try a few images!
Finally, the person who gets the most pages accepted (i.e. with acceptable accuracy) will get early blog access for 3 months! Note, however, that you must provide your e-mail address to qualify so that we can actually send it to you.
Sounds good? Sign up here! Instructions are available after logging in.
Note: due to various issues we are unable to split the pages into smaller tasks. So the images are relatively large and this is best completed on systems with a large screen such as a laptop or a desktop. So apologies if you only have mobile, but you may not be able to help with this specific project.
Special thanks to Andrew Gardner for writing the original tool and John McMaster for recent modifications!
FAQ
We'd also love to hear any suggestions for improving the workflow. These are things already on our minds:
Q: What happened after the last crowdsourcing project? (Fujitsu DSPs / TGPs)
A: Post processing took a while, but it ultimately led to massive improvements on how well the community understands these games. However we've been doing a poor job at communicating those results and still need to write a post about it. See for example this MAME post which mentions recovering "...the Sega Model 1 coprocessor TGP programs for Star Wars Arcade and Wing War, making these games fully playable."
A: Not easily. The pages aren't well aligned, so we'd need to figure out both correct straightening and cropping.
Q: Can you align the text editor to the images better? Maybe rich text features like find and replace?
A: While the chip community can unlock the secrets of the micro universe, we can't code websites for beans. Really it's a miracle that the site is running at all. If you can help with improving text entry, please reach out! FYI it's written in Python/Django and could use some cleanup. If you haven't been scared off, more info is here.
Q: What happens after it's captured?
A: First we'll post process to remove errors. After that we'll use the CPU manual to make a special 6502 assembler to create a binary. Ideally we'll also combine this with the Furby 70-800 ROM microscope images (sample above) at some point.
Q: Where did the source come from?
A: Not sure exactly, but some information is available at the Internet Archive.
Q: Can I edit my result after submission?
A: It is not possible to modify it at this time. But don't worry, most of the time we can detect errors by combining a few results.
Q: Can you reset my password?
A: Yes, but it requires manual admin intervention. We suggest creating a new account if you aren't really tied to your old one.
Q: Isn't that Furby image for the Furby 2012, not the original Furby?
A: Maybe... Actually we have a 70-800 image now
Prologue
More questions? Type them below, or reach out to us on Twitter. Thanks again for your help!