What happens in parliament has far-reaching consequences. That is why the Austrian Parliament publishes all stenographic minutes of its meetings. Unfortunately, what is discussed in Parliament only seldom reaches a broad audience. Partly this is because a lot of the speeches are political skirmishes. But partly it is also because there is simply so much being discussed that the media do not have the resources to cover and analyze it all.
The analysis of large sets of information has increasingly been outsourced to computers and algorithms. But analyzing spoken (or written) content is much more complex than analyzing numbers. We accepted the challenge of analyzing the sentiment of Austrian representatives - and have come up with the first results.
The minutes of the Austrian Parliament are published on its homepage. The initiative "Open Parliament" is working on making this information more readily available to the public. Our goal was to analyze these minutes in depth. This requires a broad set of skills and know-how. The research project VALiD, a cooperation between the University of Applied Sciences St. Pölten, the Vienna University and the drahtwarenhandlung, combines exactly these skills. The aim of this project is to analyze information over the course of time. To this end we developed Natural Language Processing (NLP) algorithms, which we then applied to the parliamentary speeches of the past 20 years. This allows us not only to count "words" (e.g. which party has the most interjections, which representative talks the most, which topics are discussed when) but also to analyze content.
Together with political scientists (also from Vienna University) we developed algorithms that are trained to identify negativity, i.e. to classify sentences according to their negativity on a scale from 0 (not negative) to 1 (very negative). This process, called sentiment analysis, is very complex, because language is much harder to interpret than numbers (sarcasm is one example). In addition, the German language is even more complex than English, for which the majority of NLP theory exists. We also had to learn that even humans cannot reliably agree on what "negative" is. Frequently, different individuals rated the same sentence as not negative and as very negative.
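How much human raters agree can be put into a number. The following is a minimal sketch, assuming two raters who each label the same sentences as 0 (not negative) or 1 (very negative). The ratings are invented for illustration, and Cohen's kappa is used here only as one common agreement measure, not necessarily the one applied in the project.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters, corrected for chance agreement."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must label the same non-empty set of sentences")
    n = len(ratings_a)
    # Share of sentences on which both raters chose the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of eight sentences (0 = not negative, 1 = very negative).
rater_1 = [0, 0, 1, 1, 0, 1, 0, 1]
rater_2 = [0, 1, 1, 0, 0, 1, 1, 1]
kappa = cohens_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

A value of 1 means perfect agreement, while a value near 0 means the raters agree no more often than chance would predict.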
But soon we could show that our classifier was not only stable, but also produced meaningful results. In the meantime these have been presented at the ECPR conference (European Consortium for Political Research) in Nottingham, and a second paper has already been accepted for the ICA conference (International Communication Association) in San Diego.
The first hypotheses we tested were very simple. The idea was to use them to verify the algorithm. For example, that the speeches of Ewald Stadler, a notorious politician, would rank among the most negative ones. Or that opposition parties would be more negative than the government. For both, our classification algorithm confirmed the expectation. Ewald Stadler did not only make it into the top 10 of negative politicians. He made it into the top 10 twice (note: he changed party affiliation during his career).
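A ranking of this kind can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function name, the speakers, the parties and the scores are all hypothetical, and averaging sentence scores is one plausible way to aggregate, not necessarily the one used in the project. Grouping by speaker and party together is what allows one person to appear twice after changing party.

```python
from collections import defaultdict
from statistics import mean


def rank_by_negativity(scored_sentences, top_n=10):
    """Rank (speaker, party) pairs by their mean negativity score."""
    scores = defaultdict(list)
    for speaker, party, score in scored_sentences:
        # The party is part of the key, so a speaker who changed party
        # is counted once per affiliation.
        scores[(speaker, party)].append(score)
    averages = {key: mean(values) for key, values in scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical classifier output: (speaker, party, negativity between 0 and 1).
sentences = [
    ("Speaker A", "Party X", 0.9),
    ("Speaker A", "Party X", 0.7),
    ("Speaker A", "Party Y", 0.6),
    ("Speaker B", "Party Z", 0.2),
    ("Speaker B", "Party Z", 0.4),
]
ranking = rank_by_negativity(sentences, top_n=2)
for (speaker, party), score in ranking:
    print(f"{speaker} ({party}): {score:.2f}")
```

In this invented data set, Speaker A occupies both top positions, once for each party.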
Next, we tested more complex hypotheses. One concerned the change in negativity over the course of time: the hypothesis was that negativity increases prior to elections and decreases during the period of governance. The algorithm supported this as well. The next step was therefore to use the algorithm to actually analyze the data.
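One simple way to check such a hypothesis is to compare the scores of speeches given shortly before an election with all others. This is a rough sketch, not the project's method: the 90-day window, the function name, the dates and the scores are assumptions made up for illustration.

```python
from datetime import date, timedelta
from statistics import mean


def split_by_election_window(scored_speeches, election_days, window_days=90):
    """Split scores into 'shortly before an election' and 'all other' speeches."""
    window = timedelta(days=window_days)
    before_election, other = [], []
    for day, score in scored_speeches:
        # A speech counts as pre-election if some election follows it
        # within the window (speeches after the election do not count).
        in_window = any(timedelta(0) <= election - day <= window
                        for election in election_days)
        (before_election if in_window else other).append(score)
    return before_election, other


# Hypothetical election date and hypothetical speech scores.
elections = [date(2020, 6, 1)]
speeches = [
    (date(2020, 5, 1), 0.8),
    (date(2020, 4, 15), 0.6),
    (date(2019, 11, 1), 0.3),
    (date(2020, 7, 1), 0.2),
]
before_election, other = split_by_election_window(speeches, elections)
print(f"before elections: {mean(before_election):.2f}, otherwise: {mean(other):.2f}")
```

With real data, a comparison like this would need a significance test before drawing conclusions; the sketch only shows the grouping step.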
One of the findings was that the speeches of the SPÖ (social democratic party, left wing) became significantly more negative during its time as an opposition party (2000 - 2008), only to drop back to pre-opposition levels once the party was in government again. The FPÖ (freedom party, right wing), in contrast, did not become less negative while in government during that time - but showed a sharp increase once it was in opposition again.
But computational analysis can go even further. We compared the debates of the last decades and could show that it is predominantly the "big topics" that stir emotions (among politicians). Of the 25 most negative debates, 15 can be assigned to only three topics: the financial crisis, the Eurofighter purchase and the early elections surrounding the FPÖ-ÖVP-BZÖ coalition.
Even though these are only the first results we have produced with our classification algorithms, they appear to be very well founded. And they hold considerable potential. In the coming months we will further evaluate the algorithms and expand them. Our plan is to add other sentiments to the classifier and also to conduct more analyses. And, as we set out to develop a tool which serves the public interest, we also intend to publish the findings in cooperation with the media.
If you like this project, we would be happy if you shared it with friends. And if you have any ideas for further analysis of the data, we would love to hear from you!