Machine learning can be applied to data from many different contexts, and politics is one of them. In Brazil, a new legislative proposal commonly generates many opinions for and against it, which echo through social networks and the media. Many factors are taken into account before the final decision is made. There are various types of legislative proposals; they can be introduced through different initiatives, follow diverse paths, pass through many hands and, in general, are influenced by several different circumstances during the approval process. This generates a large amount of data that can be analyzed automatically and effectively, with interesting results.
Three things motivated Elcius Ferreira's Bachelor's Thesis: the potential of AI to extract information and identify patterns, the availability of governmental information guaranteed by the Information Access Law, and the many factors that can influence the approval or rejection of a legislative proposal. In the thesis, he applied the Random Forest method to proposals that passed or were still being processed in Brazil's Chamber of Deputies.
Using tools provided by the National Congress, we obtained proposals that received a decision (positive or not) between January 1st, 2011 and August 31st, 2016, corresponding to the duration of President Dilma Rousseff's presidency. A second dataset was created by searching for proposals with decisions taken between August 31st, 2016 and August 16th, 2018, corresponding to most of President Michel Temer's term, which ended on December 31st, 2018.
Among the 11,039 proposals in President Dilma's dataset, 4.86% (537) were approved and 95.14% (10,502) were rejected. In President Temer's dataset, out of a total of 1,274 legislative proposals, 19.23% (245) passed and 80.77% (1,029) did not.
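The percentages above follow directly from the raw counts. The short sketch below reproduces that arithmetic; the function name `class_shares` is ours, introduced only for illustration. The rejected share is also the accuracy that a trivial rule predicting "rejected" for every proposal would reach on the full dataset, which is useful context when reading accuracy figures on imbalanced data like this.

```python
def class_shares(approved, rejected):
    """Return (total, % approved, % rejected), percentages rounded to 2 places."""
    total = approved + rejected
    pct_approved = round(100 * approved / total, 2)
    pct_rejected = round(100 * rejected / total, 2)
    return total, pct_approved, pct_rejected


# Counts taken from the text above.
dilma = class_shares(approved=537, rejected=10502)
temer = class_shares(approved=245, rejected=1029)

print(dilma)  # (11039, 4.86, 95.14)
print(temer)  # (1274, 19.23, 80.77)
```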
To train the Random Forest models, we first processed the categorical attributes, such as Subject, Type of Proposal, Date and Author. We then selected the most relevant attributes, which included Year, Sequential Number, Number of Related Proposals, Number of Updates, Proposal Type, Subject, Date and Type of Process.
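The text does not say how the categorical attributes were processed, so the following is only a minimal sketch of one common approach: splitting a date into numeric parts and one-hot encoding the remaining categories with pandas. The column names and the four toy records are hypothetical and do not come from the thesis data.

```python
import pandas as pd

# Hypothetical toy records; column names and values are illustrative only.
proposals = pd.DataFrame({
    "proposal_type": ["LP", "MPV", "LP", "LP"],
    "subject": ["Health", "Economy", "Education", "Economy"],
    "decision_date": ["2017-03-01", "2018-01-15", "2016-11-20", "2017-07-04"],
    "n_related": [2, 0, 5, 1],
})

# Split the date into numeric parts that a tree-based model can use directly.
dates = pd.to_datetime(proposals["decision_date"])
proposals["year"] = dates.dt.year
proposals["month"] = dates.dt.month
proposals = proposals.drop(columns="decision_date")

# One-hot encode the remaining categorical columns (an assumed encoding choice).
encoded = pd.get_dummies(proposals, columns=["proposal_type", "subject"])

print(encoded.columns.tolist())
```

Each category value becomes its own indicator column (for example `proposal_type_MPV`), so the resulting table contains only numeric or boolean attributes.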
We then trained predictive models using Random Forests, obtaining 98% test accuracy with the model trained on President Dilma's dataset and 94% test accuracy with the model trained on President Temer's dataset. Both models were then used to predict the chance of approval of new proposals that President Temer's government considered top priority. Of the 12 such proposals, four had already been decided, and our models correctly predicted three of them.
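The train, evaluate and predict steps described above can be sketched with scikit-learn as follows. This is not the thesis code: the data is synthetic, the labeling rule is invented for the example, and the hyperparameters (`n_estimators=200`, a 25% test split) are assumptions chosen only to make the sketch runnable.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data: 500 "proposals" with 4 numeric attributes.
X = rng.normal(size=(500, 4))
# Invented rule so that approvals (class 1) are the minority class.
y = (X[:, 0] + X[:, 1] > 1.5).astype(int)

# Hold out a test set, keeping the class proportions in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Accuracy on the held-out test set.
test_accuracy = accuracy_score(y_test, model.predict(X_test))

# Column 1 of predict_proba is the estimated probability of class 1 (approved).
new_proposals = rng.normal(size=(3, 4))
approval_chance = model.predict_proba(new_proposals)[:, 1]

# Impurity-based importances, one possible input to attribute selection.
importances = model.feature_importances_

print(f"test accuracy: {test_accuracy:.2f}")
print("approval chance per new proposal:", approval_chance)
```

The value returned by `predict_proba` is the fraction of the forest's averaged tree votes for the "approved" class, which is the kind of quantity reported as a percentage chance of approval for individual proposals.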
Of the 12 proposals, LP 8456/2017, which had a 62% chance of approval according to the model trained on President Temer's dataset, passed and became Ordinary Law 13670/2018. Additionally, LPs 9327/2017 and 3453/2015, with predicted approval chances of 95% and 59% respectively, were approved in the Chamber of Deputies and sent to the Federal Senate for further consideration. Their decisions in the Chamber agreed with what the model predicted.
MPV 830/2018 was rejected and archived by the Chamber of Deputies, contradicting the 88% chance of approval predicted by President Temer's model. Outside factors that were not captured in the model's attributes, such as political articulation and pressure from the media and the population, likely influenced the decision process.
All other proposals in the table below were still being processed at the time this work was carried out.