“Transparency”: Weathercaching

 

One of the first projects done partly with the Zygomatica team is a prime example of what we are calling “transparency” projects. The idea combines geocaching and open weather data to create what we called “weathercaching”.

The concept is simple indeed:  Let’s enhance geocaching so that you get more points for finding a cache in really horrible weather.

There are two catches that made this a very difficult project indeed:

1) How do you actually define “horrible weather”? We wanted something that would be global, unambiguous, and based on valid physiology and meteorology. We further wanted to define this “horribleness” as a single weather factor (W) between 0.5 and 5, analogous to the difficulty/terrain (D/T) points in normal geocaching.

2) How can you measure the “horribleness” of the weather automatically? Having the user measure the weather at the cache location would sound like a “trivial” solution, but this was felt to be a cop-out; the technological question only becomes interesting if open weather data are used.
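To make the target of catch 1 concrete, here is a minimal sketch of the scale itself. It assumes a hypothetical “misery score” between 0 and 1 already exists and maps it onto W between 0.5 and 5. The half-point steps are our assumption, by analogy with D/T ratings. How to compute the misery score from weather data is exactly the part that remained unsolved.

```python
W_MIN, W_MAX, W_STEP = 0.5, 5.0, 0.5  # range from the text; step size is an assumption


def weather_factor(misery: float) -> float:
    """Map a hypothetical misery score in [0, 1] to a weather factor W."""
    if not 0.0 <= misery <= 1.0:
        raise ValueError("misery must be between 0 and 1")
    raw = W_MIN + misery * (W_MAX - W_MIN)
    # Snap to the nearest half point, as is done for D/T ratings.
    return round(raw / W_STEP) * W_STEP


if __name__ == "__main__":
    for score in (0.0, 0.3, 1.0):
        print(score, "->", weather_factor(score))
```

The mapping is linear only for simplicity; nothing in the project suggests that misery actually scales linearly with any weather parameter.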

Full analysis: A full analysis is on the project page. (Note that the text is rather academic and dry.)

Summary: There are “weather corridors” along Finnish highways which have enough weather stations to allow sufficiently accurate monitoring of the weather (see map below, adapted from www.geocaching.com). Even more importantly, the majority of caches are within these corridors.  Problem 2 is therefore technically solvable. However, we could not find a reasonable solution for problem 1. Meteorology has good parameters to define hazardous weather; it does not have any tools to define miserable weather. Also, there is no unambiguous way to determine W from the weather data; misery is culturally defined.
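The corridor idea can be illustrated with a short sketch: a cache counts as “covered” if at least one weather station lies within some maximum distance. The 10 km threshold, the function names, and the coordinates in the example are all hypothetical; the full analysis may use a different criterion.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_corridor(cache, stations, max_km=10.0):
    """True if any station is within max_km of the cache (assumed threshold)."""
    lat, lon = cache
    return any(haversine_km(lat, lon, s_lat, s_lon) <= max_km for s_lat, s_lon in stations)


if __name__ == "__main__":
    # Made-up positions for illustration, not real weather stations or caches.
    stations = [(60.0, 25.0), (61.0, 25.0)]
    print(in_corridor((60.01, 25.0), stations))  # about 1 km from a station
    print(in_corridor((60.5, 27.0), stations))   # far from both stations
```

Running such a check over all cache coordinates would give the share of caches inside the corridors, which is the quantity the summary refers to.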

Conclusion: The technology and data sources exist. The specific application itself is, however, not worth implementing, and no demonstrator was made. Other uses for the weather data might well be possible.

Team:  Jakke Mäkelä, Pertti Sundquist, Gavin Treadgold, Kalle Pietilä, Niko Porjo.

Map adapted from www.geocaching.com

“Opacity”: Search engines / Hakukoneet

 

How to use small languages to study search engines

Search engines are very opaque; it is difficult to know what is happening and why, how to study it, and how to interpret the results. Even in a field filled with more professional researchers, we feel there are some niches to be explored. We have currently focused on Finnish, as it provides an interesting “laboratory”: a small and peculiar language with a unique grammar in a high-tech and highly networked country. The amount of raw material is huge.

#1/2012: “Onko Google ainoa käyttökelpoinen hakukone suomen kielellä?” [Is Google the only usable search engine in Finnish?]

Full report (in Finnish): Mäkelä et al- Suomalainen Bing_Google 2012- raportti

Abstract (translated from Finnish): We wanted to study whether it is true that “Google is the only usable search engine in Finnish”. In the light of statistics this really is the case; Google’s share in Finland is about 98%. This is a monopoly in practice, and it is worth looking for the reasons behind it. The statement could not be studied analytically, so the research question was narrowed down as follows: “Google is a significantly better search engine than Bing when searching in Finnish”. The study compared the hit counts obtained when certain search terms were entered into the Finnish versions of Google and Bing. Bing was found to return significantly fewer results than Google, on average less than 10% of Google’s hits. In addition, Google appears to react faster to emerging news topics. At least two features specific to Finnish affect searches. Google systematically replaces the Scandinavian letters (ä, ö) with their commonly used equivalents (a, o). Bing does not behave as systematically, and in this respect Bing’s search does not work in the way users have come to expect. Ordinary Finnish compound words cause mild problems for both search engines.

The results do not directly say anything about the quality of the search engines. However, the number of hits is the subjective measure that we believe most people use to judge how “good” a search engine is. By this measure Bing falls dramatically behind Google. In addition, Google handles the Scandinavian letters more consistently. A closer content analysis would be needed to assess whether Google is “really” the better search engine; on the basis of these results it is nevertheless easy to understand why the public thinks so.

Google’s share worldwide is about 90%. With a few exceptions it is above 90% in all European countries, often above 96%. (For comparison, the share is 80% in the USA, about 60% in Russia, and about 30% in China.) A similar study would therefore be useful for other small languages as well.
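The replacement of ä and ö with a and o mentioned in the abstract can be sketched as a simple character mapping. This only illustrates the observed behaviour; it is not a description of how Google or Bing implement it, and the function names are ours.

```python
# Illustrative mapping only; real search engines may normalize text differently.
FOLD_TABLE = str.maketrans({"ä": "a", "ö": "o", "Ä": "A", "Ö": "O"})


def fold_scandinavian(term: str) -> str:
    """Replace ä/ö (and their capitals) with a/o."""
    return term.translate(FOLD_TABLE)


def query_variants(term: str) -> list:
    """Return the term plus its folded form, if the two differ."""
    folded = fold_scandinavian(term)
    return [term] if folded == term else [term, folded]


if __name__ == "__main__":
    print(query_variants("hääyö"))
    print(query_variants("hakukone"))
```

Comparing the hit counts for both variants of a term is one way to check how consistently a search engine treats the two spellings.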

English summary: We studied the statement “Google is the only feasible search engine for searches in Finnish”. The claim is supported by the 98% market share Google has in Finland. To analyze the question, we studied results from searches made in Finnish with Google and Bing (which, together with Yahoo, is the only credible alternative). We found that in terms of number of hits, Google is crushingly dominant, with Bing typically finding less than 10% of the results. Bing seems especially “slow” in finding trending news, which is a serious drawback for a search engine. It is apparent that Google is reasonably well optimized for some quirks of the Finnish language, while Bing is not. The clearest difference is in the processing of Scandinavian characters (ä, ö), where Bing’s performance is unpredictable. Both search engines have some problems with another Finnish quirk, compound words, but neither is clearly superior. Other potential differences were found relating to the agglutinative character of Finnish grammar, but these could not be studied systematically so far. This study did not analyze the “true” quality of Bing vs Google searches at the content level. However, the statistical results alone are sufficient to explain why Bing is not generally considered a viable option in Finnish. Such dominance of a small language by a single search engine should be considered a national concern. The situation is very similar for other small languages, in Europe and elsewhere, and it is recommended that similar studies be performed in other countries.
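The hit-count comparison boils down to a ratio per search term and an average over terms. The sketch below shows the arithmetic; the terms and counts in the example are invented placeholders, not results from the report, and the function names are ours.

```python
from statistics import mean


def bing_share(google_hits: int, bing_hits: int) -> float:
    """Bing hits as a fraction of Google hits for one search term."""
    if google_hits <= 0:
        raise ValueError("google_hits must be positive")
    return bing_hits / google_hits


def mean_share(counts: dict) -> float:
    """Average Bing/Google ratio over all terms; counts maps term -> (google, bing)."""
    return mean(bing_share(g, b) for g, b in counts.values())


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    example = {"term_a": (1000, 80), "term_b": (500, 20)}
    print(f"Bing returns {mean_share(example):.0%} of Google's hits on average")
```

Averaging the per-term ratios weights every term equally; dividing the summed Bing hits by the summed Google hits would instead let very common terms dominate.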

 

“Creativity”: catapult camera

 

Catapult camera

The “catapult camera” is an example of a project that produced no useful outcome whatsoever. It is included here as an example of a solution that did not find a problem.

The idea was inspired by mast (telescopic) camera systems that can be used to map and monitor, for example, disaster zones. Such systems typically appear to weigh a few tens of kg and can support cameras of 4 kg or more. Commercial systems appear to cost some thousands of EUR. The typical height that can be reached is 10 meters. There is a technology capable of reaching altitudes well above 10 meters: small remote-controlled aircraft (helicopters or gliders). These are, however, not cheap, and are not necessarily very robust in extreme circumstances.

We proposed building a catapult capable of launching a camera to an altitude of about 40 meters, taking images while it is in the air, and stitching the pictures into a panorama image.
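A rough feel for what a 40-meter launch demands comes from basic ballistics. The sketch below assumes a vertical launch, no air resistance, and a landing at launch height; these are simplifying assumptions of ours, not figures from the report.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def launch_speed(apex_m: float) -> float:
    """Initial vertical speed (m/s) needed to reach apex_m, ignoring drag."""
    return math.sqrt(2 * G * apex_m)


def flight_time(apex_m: float) -> float:
    """Total time in the air (s) for a vertical launch that peaks at apex_m."""
    return 2 * math.sqrt(2 * apex_m / G)


if __name__ == "__main__":
    print(f"speed: {launch_speed(40):.1f} m/s, airtime: {flight_time(40):.1f} s")
```

Under these assumptions the camera must leave the catapult at about 28 m/s and has under six seconds in the air to capture its images; drag would raise the required speed further.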

Full report: CatapultCamera-Final.pdf

Outcome: The solution has far too many issues to be useful in real life. Projectiles are likely to get lost or broken, and the image quality is far too poor to be useful. Most problematically, the cost of radio-controlled drones is plummeting, and these will be more competitive in every imaginable way. The problem is valid and important; the solution is not.

Team: Jakke Mäkelä, Niko Porjo, Kalle Pietilä.
