Conserving Linguistic Heritage the FOSS way...

Hello! I am Omshivaprakash, a Bengaluru-based Wikimedian and FOSS contributor.
I’m here to share my experience of helping reuse and conserve the linguistic heritage of Kannada the FOSS way!

2013-14: Vachana Sanchaya
11th and 12th century literature & the need of the hour...

“We need to be able to do research on Vachana Sahitya. We should be able to search Vachanas on the net. We need data to understand the Sahitya much better.”
- Sri OL Nagabhushana Swamy
- Sri Vasudendra

Challenges
▣ ANSI data available on the GoK website
▣ GoK website not intuitive
▣ 15 large printed volumes + others
▣ No real tool to analyze the data at one's fingertips
▣ Heated discussions on public forums needed concordance & numerical data to debate the literature
Researchers wanted authentic data to reach consensus through research… but how?

Digitize in Unicode
The idea was to get our hands on the digitized data in a reusable format, in Unicode.

Scrape
We found that the data was available in digital form on the GoK website http://vachanasahitya.gov.in, but in ANSI format. We pulled the data with wget, wrote a Python script to systematically extract it, and converted the text to Unicode. ALL IN FLAT FILES.

Getting to work on the data
But... it was not really enough. How does anyone take all the text in files and do research? We proposed to push this into a database and provide simple GUI tools to search the text and look at the results.
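A minimal sketch of the "flat files to database" step, assuming SQLite as the store. The table name `vachana`, the helpers `build_index` and `search`, and the sample rows are hypothetical placeholders for illustration; they are not the project's actual schema or data.

```python
import sqlite3


def build_index(rows):
    """Load (author, body) pairs into an in-memory SQLite table.

    Hypothetical schema: one row per Vachana, already converted to Unicode.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vachana ("
        "id INTEGER PRIMARY KEY, author TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO vachana (author, body) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def search(conn, term):
    """Return (id, author, body) for every row whose body contains term.

    instr() does a plain substring match, so no LIKE wildcards need escaping.
    """
    cursor = conn.execute(
        "SELECT id, author, body FROM vachana "
        "WHERE instr(body, ?) > 0 ORDER BY id",
        (term,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # Placeholder rows, not real Vachana text.
    sample = [
        ("author-a", "placeholder line with ಗುರು in it"),
        ("author-b", "placeholder line with ಲಿಂಗ in it"),
    ]
    conn = build_index(sample)
    for row in search(conn, "ಗುರು"):
        print(row)
```

A parameterized query keeps the search term out of the SQL string itself, which matters once the search box is exposed to people who are not writing SQL by hand.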

More challenges...
Technical difficulties: Providing the end results to a large number of people, and helping them learn to use tools such as MySQL Workbench, SQLite Manager, etc.
Awareness: Text input methods, SQL syntax, OS compatibility
Expanding scope: What about other research requirements? How many queries can we write and keep sharing with linguists who are not computer-savvy?

An opportunity to build something
For a language that is close to our hearts, with a few like-minded people, over a cup of coffee, during weekends, whenever we had some time to scribble through the needs of our people… IT WAS FUN...

We built Vachana Sanchaya http://vachana.sanchaya.net

Portal for linguistic research

Visualization, Discussion board, Concordance & more...

Enable everyone: students, researchers, the common man

To unearth the wealth of literature
▣ by reading and searching through 21 thousand Vachanas
▣ written by 250 Vachanakaaras
▣ researching at one's fingertips via concordance & quick visualizations
▣ building a corpus of 2 lakh+ (200,000+) unique words
▣ building biodata of all male & female Vachanakaaras
▣ enabling a crowdsourced review solution
▣ opening up new possibilities for linguistic research across other literary works of Kannada.
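The concordance and unique-word corpus mentioned above can be sketched roughly as follows. This is an illustration only, not the portal's code: `tokenize`, `word_counts`, `concordance`, the punctuation set, and the sample strings are all assumptions.

```python
from collections import Counter

# Punctuation stripped from word edges; an assumed set, extend as needed.
PUNCT = ".,;:!?\"'()[]-।॥"


def tokenize(text):
    """Split on whitespace and strip edge punctuation.

    Whitespace splitting is used instead of the regex \\w class because
    Kannada vowel signs are combining marks, which \\w does not match in
    Python's re module, so \\w+ would break words apart.
    """
    words = (w.strip(PUNCT) for w in text.split())
    return [w for w in words if w]


def word_counts(texts):
    """Count every word across all texts; len() gives the unique-word total."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def concordance(texts, keyword, width=2):
    """Keyword-in-context: (text index, left context, keyword, right context)."""
    hits = []
    for doc_id, text in enumerate(texts):
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((doc_id, left, tok, right))
    return hits


if __name__ == "__main__":
    # Placeholder word lists, not real Vachana text.
    sample = ["ಗುರು ಲಿಂಗ ಜಂಗಮ", "ಲಿಂಗ, ಗುರು."]
    print(len(word_counts(sample)), "unique words")
    for hit in concordance(sample, "ಲಿಂಗ", width=1):
        print(hit)
```

Exact-match lookup is the simplest choice here; matching inflected forms of a Kannada word would need stemming or a morphological analyzer, which this sketch does not attempt.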

We reached masses across the world...

FOSS
All because of the FOSS tools around us and the philosophy that we believed in...

Rails, Nginx, Passenger, Memcached, MySQL, Python, GitLab, WordPress & more...
Only the server cost to keep it running
Localized & being adopted in other projects too...
It is being reviewed for contribution to Wikisource & Wikipedia

Moving forward
Bring more literary works online
Standardize a research platform for the language
Create a timeline for centuries of heritage

How are we planning to do this?
Collaboration: Enable community collaboration to build research documents around our literary heritage
Engage: Engage students and others to work together on our code to build robust and futuristic tools for all types of literary works (text, poems, Old Kannada, etc.)
Evolve: Evolve over time, learning from mistakes, reviews and feedback
Consult with communities: We would like to consult and learn from multiple language communities, because Vachana Sahitya has been translated into more than 15 languages
Keep tweaking: We keep tweaking the tool to make it robust enough to serve as a platform for our upcoming projects
Reaching goals: We are determined to reach our goal of building a unified search tool with a timeline for centuries of Kannada literature the FOSS way...

We are on social media: FB/Twitter/Google+
Embed us on WordPress via a plugin
We will be on mobile soon…
We are opening up APIs to reuse the data or build tools around Kannada literature
Adding English and other translated works too...
There is a lot more to share. So, keep in touch!!!

Our Team Pavithra, Myself, OLN, Vasudendra, Devaraj

Thanks! Any questions?
You can find me at:
Kn/En Wiki: User:Omshivaprakash
Project Page: http://vachana.sanchaya.net
Main Project: http://kannada.sanchaya.net
@omshivaprakash | @vachanasanchaya

Credits
Special thanks to all the people who made and released these awesome resources for free:
▣ Team photo by Amit Mrugvadhe
▣ To my team for having made this possible
▣ Minicons by Webalys
▣ Presentation template by SlidesCarnival
▣ Photographs by Unsplash
