Multimodal Machine Learning for Major League Baseball Playoff Prediction

Aliaa Saad Yaseen, Ali Fadhil Marhoon, Sarmad Asaad Saleem


The introduction of sabermetrics changed the way Major League Baseball (MLB) teams value their players. Since then, new baseball statistics have been developed to support various predictions about MLB teams. Drawing on the immense amount of available data on baseball players, teams, and scores, we use supervised machine learning algorithms to see how accurately we can predict which teams will make the playoffs in 2019. For this research, we gathered data from the last 20 years. The features used by our models include Runs, Batting Average, Home Runs, Strikeouts, Innings Pitched, Earned Runs, and Earned Run Average. We chose a Logistic Regression model and a Support Vector Classifier (SVC) as the two machine learning algorithms. In testing, our trained models classified only 77% of teams correctly, with a recall of 59%. As a result, the overall projected model was only 60% accurate: it correctly predicted 6 of the 10 teams that made the 2019 playoffs. We believe these findings could be improved by using other machine learning algorithms or by including more features, thereby increasing the overall accuracy of the model.
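As a rough illustration of how a figure such as "6 of the 10 playoff teams" relates to precision and recall, the sketch below scores a predicted playoff field against an actual one. The team labels and the helper name `precision_recall` are hypothetical placeholders chosen for this example; they are not the study's data or code.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of team labels."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: share of predicted playoff teams that really qualified.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of actual playoff teams that the model identified.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical placeholder labels, not real 2019 results.
actual_playoff = {f"team_{i:02d}" for i in range(1, 11)}     # team_01..team_10
predicted_playoff = {f"team_{i:02d}" for i in range(5, 15)}  # team_05..team_14

precision, recall = precision_recall(predicted_playoff, actual_playoff)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

When the model names exactly as many teams as actually qualify (10 in this sketch), precision and recall coincide, so 6 correct picks out of 10 gives 0.60 for both.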


Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.