The Race Is On — second and third place Grand Prize ($100k, $50k), four new $5,000 Open Source Prizes, and more!

We’re going to read the scrolls.

Oct 20, 2023

Last week, we all saw the results of the First Letters Prize, and the subsequent worldwide news coverage, such as this excellent piece in The Economist. For a deeper dive, we also did an Ask Me Anything (AMA) with Nat, the team, and contestants on X Spaces.

Welcome to everyone who is joining us now! :)

With Luke’s and Youssef’s code now public, the field is wide open for anyone going after the Grand Prize. The deadline is about 10 weeks away, so we’re making some announcements to hurry things along even further. LET’S MOVE FAST AND READ SCROLLS!

*Youssef’s latest published results (using his private model).*

New donations

To date, the Vesuvius Challenge has spent about $500,000 on prizes and operational costs. With the Grand Prize still outstanding, we were very lucky to receive a ton of new donations, which will enable us to scan more scrolls, make more segments, and award more prizes! Thanks so much to these generous new donors:

Bastian Lehmann (another $50,000)
Akshay Kothari ($25,000)
Anjney Midha ($25,000)
Mark Cummins ($25,000)
Jamie Cox & Gary Wu ($15,000)
Mike Mignano ($15,000)
Brandon Silverman ($10,000)
Katsuya Noguchi ($10,000)
Aravind Srinivas ($10,000)
Shariq Hashme ($10,000)
Sahil Chaudhary ($10,000)
Matias Nisenson ($10,000)

And finally, Nat Friedman and Daniel Gross increased their own donations by $100,000 each to cover ongoing operational costs. We’ve raised about 1.8 million dollars in total now. We’re honestly blown away by all the support and passion we’ve received. Thank you.

Second and third place Grand Prize

With the new donations, we are introducing a second place Grand Prize of $100,000, and a third place of $50,000. These will be awarded to the second and third teams that make a qualifying submission before the December 31st deadline. This makes the Grand Prize a bit less of an all-or-nothing proposition.

We’ve also clarified some of the criteria of the Grand Prize. The new criteria read:

A Review Team made up of technical experts and papyrologists will assess all Grand Prize submissions to ensure that they can:

Read at least 4 passages from the available full-scroll data, each containing at least 140 characters of contiguous text (e.g. within the same column)
Verify that each passage contains no more than 15% of characters that are missing or illegible
The 140 characters per passage include the up to 15% of characters that may be missing or illegible, so at least 119 characters (85% of 140) must be legible. Characters only count as legible when identified on a letter-by-letter basis, without papyrological interpolation.
Confirm that submissions contain legitimate and linguistically plausible text.
Independently reproduce and verify your results using your code and documented techniques.
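To make the arithmetic in these criteria concrete, here is a minimal sketch of the objective part of the threshold. It is not the Review Team’s actual procedure. It assumes a hypothetical transcription convention in which `?` marks a missing or illegible character and whitespace is not counted, and the function names are made up for illustration. Linguistic plausibility and reproducibility are judged by people and are not modeled here.

```python
# Thresholds taken from the criteria above.
MIN_PASSAGES = 4
MIN_CHARS = 140              # per passage, including missing/illegible characters
MAX_ILLEGIBLE_PERCENT = 15

# Assumption for this sketch only: "?" marks a missing or illegible character.
ILLEGIBLE_MARK = "?"


def passage_qualifies(transcription: str) -> bool:
    """Check one passage against the length and legibility thresholds."""
    # Assumption: whitespace is layout, not a character of the passage.
    chars = [c for c in transcription if not c.isspace()]
    total = len(chars)
    if total < MIN_CHARS:
        return False
    illegible = sum(1 for c in chars if c == ILLEGIBLE_MARK)
    # Integer arithmetic avoids floating-point rounding at the 15% boundary.
    return illegible * 100 <= MAX_ILLEGIBLE_PERCENT * total


def submission_meets_threshold(passages: list) -> bool:
    """True if at least MIN_PASSAGES passages qualify."""
    return sum(passage_qualifies(p) for p in passages) >= MIN_PASSAGES


# For a passage of exactly 140 characters, at most 21 may be illegible,
# which leaves at least 119 legible characters.
max_illegible_at_140 = (MIN_CHARS * MAX_ILLEGIBLE_PERCENT) // 100
min_legible_at_140 = MIN_CHARS - max_illegible_at_140
```

With these numbers, a 140-character passage with 21 illegible characters passes, while one with 22 does not.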

If no team meets the criteria by the deadline, we reserve the right to award the prizes to the teams that came closest. This is not a guarantee: we will only award prizes if we believe the spirit of the prize has substantially been met and if a submission comes very close to the objective threshold. This is entirely at our discretion. If you are very, very close to meeting the bar, we encourage you to submit your work before the deadline.

We hope all of these changes make the Grand Prize more objective, fair, and attractive. We can’t wait for someone to win it!

Four new $5,000 Open Source Prizes by Nov 30th

In our tradition of progress prizes, we’re awarding four $5,000 prizes for qualifying submissions by Nov 30th. This time, we are not awarding prizes specific to segmentation or ink detection: anything that increases the probability of reading the scrolls this year qualifies.

There are some conditions:

Your submission must substantially increase the probability of reading the scrolls this year, as judged by the Review Team. We may award more or fewer prizes at our discretion, depending on the number of qualifying submissions.
Your submission must be open source.
We are heavily favoring submissions that:
- Are released early. Tools released tomorrow have a higher chance of being used for reading the scrolls than those released a day before the deadline.
- Actually get used. We’ll look for signals from the community: questions, comments, bug reports, feature requests. Our Segmentation Team will publicly provide comments on tools they use.
Submissions close on Thursday, November 30th at 11:59pm PT, after which the Review Team will select winners.

Segmentation is still a big focus, since for the Grand Prize we need multiple large segments, each large enough to contain contiguous passages of at least 140 characters.

Some ideas to get you started:

The Segmentation Team has put together a list of feature requests — be sure to tag @Hari_Seldon on Discord if you’re going to work on this. These include:
- Many feature requests for Volume Cartographer (we’ve been using @RICHI’s fork)
- Adding metadata of which areas were auto-segmented vs manually adjusted
- Annotating which areas were hard to segment or uncertain
- Integration of ink detection models to help see if an area was segmented correctly
- Real time papyrus visualization like Khartes
- Segmenting “at an angle” for tilted papyrus surfaces
Tools for browsing all segments and viewing open source model outputs on them.
Tools for detecting overlap in segments, to avoid unwittingly running training and inference on the same data.
Tools for merging segments or laying out multiple related segments in 2D (like Marzia D’Angelo’s fragment map; more explanation here).
Auto-segmentation of “mushy” areas: algorithms / models that can detect and follow the papyrus fibers, even where they are bunched up with other layers of papyrus.
Comparison of various ink detection model architectures.
Better tools for annotating ink in segments, e.g. by manually finding “crackle” or by reinforcing model outputs.

*Marzia D’Angelo showing her fragment map — can we make something like this for our segments?*
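As a starting point for the overlap-detection idea in the list above, here is a rough sketch that flags two segments as overlapping when they occupy the same coarse grid cells of the scroll volume. It assumes each segment is available as a collection of (x, y, z) points in a shared volume coordinate system. The function names and the cell size of 50 are illustrative choices, not part of any existing tool.

```python
import math
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Cell = Tuple[int, int, int]


def occupied_cells(points: Iterable[Point], cell_size: float = 50.0) -> Set[Cell]:
    """Map 3D points to the set of coarse grid cells they fall into."""
    return {tuple(math.floor(c / cell_size) for c in p) for p in points}


def overlap_fraction(points_a: Iterable[Point],
                     points_b: Iterable[Point],
                     cell_size: float = 50.0) -> float:
    """Fraction of the smaller segment's cells that the other segment also occupies."""
    cells_a = occupied_cells(points_a, cell_size)
    cells_b = occupied_cells(points_b, cell_size)
    if not cells_a or not cells_b:
        return 0.0
    return len(cells_a & cells_b) / min(len(cells_a), len(cells_b))


# Two made-up flat patches that share half of their area.
segment_a = [(float(x), float(y), 0.0) for x in range(0, 100, 10) for y in range(0, 100, 10)]
segment_b = [(float(x), float(y), 0.0) for x in range(50, 150, 10) for y in range(0, 100, 10)]
shared = overlap_fraction(segment_a, segment_b)
```

A coarse grid is cheap and tolerant of small differences between two tracings of the same sheet, but it can also report false overlaps between adjacent papyrus layers that sit closer together than the cell size. A real tool would probably need a finer comparison for flagged pairs.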

Weekly segment releases on Fridays noon PT

Now that we’re getting closer to the Grand Prize, a single segment can determine whether a team has enough passages to make a submission. From now on, we’ll release new segments every Friday at noon PT, so the schedule is predictable for everyone. We will update the Segmentation Directory 24 hours beforehand (noon PT on Thursdays) with the segments that we will publish.

We’ll try to release segments even when they’re not done yet, so everyone can start running their models on them right away. New segments will start with an ID ending in 0. When we later release larger versions of the same segment, we’ll increment the final digit of the segment ID by 1. We’ll also add a “_superseded” suffix to the relevant segment folders on the data server.
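If you script against the data server, the versioning scheme above can be handled with a few lines. The sketch below is one possible reading of it: it assumes segment folder names are purely numeric IDs, that the final digit is the version, and that the “_superseded” suffix is appended to the folder names of older versions. The IDs used are made up, and the helper names are hypothetical.

```python
SUPERSEDED_SUFFIX = "_superseded"


def parse_folder(folder_name: str):
    """Return (base, version, superseded) for a segment folder name."""
    superseded = folder_name.endswith(SUPERSEDED_SUFFIX)
    seg_id = folder_name[: -len(SUPERSEDED_SUFFIX)] if superseded else folder_name
    if not seg_id.isdigit():
        raise ValueError(f"unexpected segment folder name: {folder_name!r}")
    # Assumption: the final digit is the version, everything before it is the base.
    return seg_id[:-1], int(seg_id[-1]), superseded


def next_version_id(seg_id: str) -> str:
    """ID of the next, larger release of the same segment."""
    base, version = seg_id[:-1], int(seg_id[-1])
    if version == 9:
        # The announcement does not say what happens after the tenth release.
        raise ValueError("no version after 9 is defined in this sketch")
    return f"{base}{version + 1}"


def latest_versions(folder_names):
    """Keep only the highest version of each segment, as plain IDs."""
    latest = {}
    for name in folder_names:
        base, version, _ = parse_folder(name)
        if base not in latest or version > latest[base]:
            latest[base] = version
    return sorted(f"{base}{version}" for base, version in latest.items())


# Made-up folder listing: one segment with two releases, one with a single release.
folders = ["20990101000000_superseded", "20990101000001", "20990202000000"]
current = latest_versions(folders)
```

Filtering to the latest version before training or inference helps avoid processing both an old and a new release of the same segment.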

We’re also doing away with the /hari-seldon-uploads/team-finished-paths/ directory. All segments will be released to /full-scrolls/Scroll{1,2}.volpkg/paths/. Existing segments will also be moved there.

We will also update Volume Viewer and Segment Viewer at the same time.