"Are we there yet?"
Jul. 18th, 2025 10:12 am
When will this project be done?
When the work is done. Can you help get it there? No one is paying anyone to do this, so it will take all of our efforts combined.
How can I help?
Several ways!
1) Offer to tag a tab. Any language we have works, though if you speak one other than English, that's first priority. The smaller languages often have smaller tabs, too, so you could be done in a very short time!
2) Tag Unknown groups. There are several on our Discord server waiting for a volunteer to read the messages, or a summary of them, and decide on the best tag. This is a very low time commitment (you can do just one or two groups as you have time!) but the efforts will add up. It's a great way to get started tagging if you find an entire tab intimidating.
3) Tag an Unknown tab. It would save me a lot of time if someone were willing to take a whole tab of Unknown groups and open the data to see what it is. We have a visual tutorial for Sylpheed, a simple freeware email client that will natively import the mbox files that are Yahoo's standard message format. It's not hard to use! You just need an actual computer (a phone or tablet won't work for this). And you could make use of the thread feature in our Unknown groups channel to enlist help if you're stuck on one.
4) Volunteer to be pinged to use Google Translate on messages for Unknown groups. Some groups are in languages we don't have a volunteer for, but which Google Translate will handle correctly. All you have to do here is copy/paste them into Google Translate and copy/paste the translated messages back into the thread, so that tagging volunteers can read them in English. Easy! It will save us so much time if we have someone to ping for this.
5) Help with languages we don't have a volunteer for. For instance, we currently have Romanian and Farsi groups waiting for someone to read them and summarize the content. (Indonesian and Arabic are likely to turn up, and we currently have no volunteers for those two either.) Google Translate adds an extra step that slows things down. If you can read any of these languages yourself and are willing to help, offer to be pinged whenever there are groups in your language.
6) Volunteer to reach out to find speakers of the more obscure languages. A number of groups were in languages that Google Translate does not have in its data banks; we cannot read these groups at all. In some cases, we're not even sure of the language identification. Someone has to find speakers of each of those languages to ask them to read the messages and summarize them so we can tag them.
If you're interested in seeing this project completed and you haven't volunteered to help yet, please consider changing that and finding one of the ways above to contribute. :)