Custom Dashboard Development for Federal Agency

 

The Challenge:

In 2021, one of the biggest financial stories was the “short squeeze” that retail investors applied to hedge funds through GameStop’s stock. The company was a nearly defunct brick-and-mortar video game retailer that quickly became the center of the financial world when its stock rose from $20 per share to a high of $450 over a 3-day span.

In essence, retail investors on the Internet publicly joined together in an attempt to undermine a hedge fund that had shorted GameStop stock, which led the Robinhood Financial, LLC trading platform to restrict purchase orders of the stock. Social media platforms and websites like Reddit played a major role as communication channels for manipulating the market, and a public outcry resulted.

For one government agency, this presents a growing challenge. In response, Makpar implemented a Proof of Concept that aligns with the agency’s strategic goal to “expand market knowledge and oversight capabilities to identify, understand, analyze, and respond effectively to market developments and risks.”

 

The Solution:

Over a two-week Sprint, Makpar’s Innovation Lab team developed a Fraud Detection Analytics Dashboard Proof of Concept using a graph database to examine overall investor sentiment on social media sites such as Twitter, Facebook, and Reddit.

Relational databases store highly structured data and rely on compute-heavy, memory-intensive join operations to associate records. Graph databases, by contrast, store the relationships between entities directly, so connections do not have to be inferred at query time, which dramatically reduces the development time needed for this type of analysis. This enabled the Makpar team to download an extensive amount of social media data and analyze relationships among those discussing the GameStop stock.
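
The contrast between joins and stored relationships can be illustrated with a minimal sketch. It uses plain Python dictionaries rather than Neo4j, and the node names, relationship labels, and sample edges are all hypothetical; it only shows the idea that a traversal follows edges that are already stored instead of matching rows across tables.

```python
from collections import defaultdict

# Hypothetical sample data: (source, relationship, target) triples.
# In a graph database, relationships like these are stored as edges.
EDGES = [
    ("user_a", "POSTED", "post_1"),
    ("user_b", "POSTED", "post_2"),
    ("user_c", "REPLIED_TO", "post_1"),
    ("post_1", "MENTIONS", "GME"),
    ("post_2", "MENTIONS", "GME"),
]


def build_graph(edges):
    """Index edges in both directions so each hop is a dictionary lookup."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for source, rel, target in edges:
        outgoing[source].append((rel, target))
        incoming[target].append((rel, source))
    return outgoing, incoming


def users_discussing(ticker, incoming):
    """Walk ticker <- post <- user by following stored edges."""
    users = set()
    for rel, post in incoming[ticker]:
        if rel != "MENTIONS":
            continue
        # Anyone who posted or replied to the post counts as a participant.
        for _, user in incoming[post]:
            users.add(user)
    return sorted(users)


outgoing, incoming = build_graph(EDGES)
print(users_discussing("GME", incoming))  # ['user_a', 'user_b', 'user_c']
```

In a relational schema, the same question would typically require joining a users table, a posts table, and a mentions table; here each hop is a direct lookup on an edge that already exists.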

This effort also demonstrated that particular social media users generating a wide range of comments about GameStop were actually bots. For agencies that need to direct enforcement toward users who use fake social media accounts in an attempt to affect financial markets, the ability to glean this kind of information can provide tremendous cost savings in Internet forensic efforts.
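
The write-up does not describe how bot accounts were identified. As one possible illustration only, the sketch below flags accounts that post many near-identical comments. The function names, thresholds, and sample accounts are hypothetical and are not taken from the Proof of Concept.

```python
from collections import Counter


def duplicate_ratio(comments):
    """Fraction of an account's comments that repeat an earlier comment."""
    if not comments:
        return 0.0
    # Normalize case and whitespace so trivial variations count as repeats.
    normalized = [" ".join(c.lower().split()) for c in comments]
    counts = Counter(normalized)
    repeats = sum(n - 1 for n in counts.values())
    return repeats / len(normalized)


def flag_suspected_bots(comments_by_user, min_comments=5, min_ratio=0.6):
    """Return accounts with enough comments and a high share of repeats.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    flagged = []
    for user, comments in comments_by_user.items():
        if len(comments) >= min_comments and duplicate_ratio(comments) >= min_ratio:
            flagged.append(user)
    return sorted(flagged)


# Hypothetical sample accounts.
sample = {
    "acct_1": ["GME to the moon!"] * 6,
    "acct_2": [
        "Bought ten shares today",
        "Is this sustainable?",
        "Holding for now",
        "What is the short interest?",
        "Selling half",
        "Reading the annual report",
    ],
    "acct_3": ["Buy GME now", "buy  gme NOW"],
}
print(flag_suspected_bots(sample))  # ['acct_1']
```

A repetition check like this is only a first-pass signal; in practice it would likely be combined with relationship data, such as which accounts consistently amplify one another.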

 
 

Tools Used:

Makpar tested a wide range of tools in its Innovation Lab sandbox to determine the most appropriate options for this particular use case.

A). Neo4j Graph Database: Neo4j is a highly scalable native graph database, purpose-built to leverage not only data but also data relationships. Powered by a native graph storage and processing engine, Neo4j delivers an intuitive, flexible and secure database for unique, actionable insights.

B). Extraction of data went through several test phases:

  • Twitter – Selenium, BeautifulSoup, and direct interaction with the Twitter API.

  • Reddit – Python libraries, tried in the order PRAW, PSAW, and PMAW.

  • YouTube comments – Selenium, because YouTube pages are JavaScript-rendered.

  • Financial data was pulled through direct API interaction and BeautifulSoup.

  • News articles were scraped using the Python library news-fetch.

C). Data cleaning and semantic classification were performed using Python (pandas, regular expressions, NLTK) and R (openNLP, quanteda, tm, koRpus, RWeka).

D). Dashboarding platforms evaluated:

  • Tableau

  • QlikView

  • Amazon QuickSight

  • Neo4j Bloom
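
The data cleaning and classification step in item C can be sketched in a few lines. This is a simplified illustration that uses only Python's standard `re` module; it is not the NLTK or R pipeline the team used, and the word lists, function names, and sample comment are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")

# Hypothetical toy lexicons; a real classifier would use a trained model
# or a curated sentiment lexicon.
POSITIVE = {"buy", "moon", "hold", "rally"}
NEGATIVE = {"sell", "crash", "dump", "scam"}


def clean_text(text):
    """Lowercase, drop URLs and @mentions, strip non-letters, and tokenize."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return text.split()


def classify(tokens):
    """Label tokens by counting lexicon hits in each direction."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


comment = "@trader123 BUY $GME now!!! To the moon https://example.com/gme"
tokens = clean_text(comment)
print(tokens)            # ['buy', 'gme', 'now', 'to', 'the', 'moon']
print(classify(tokens))  # positive
```

Cleaning before classification matters because URLs, mentions, and punctuation would otherwise inflate the vocabulary without adding sentiment signal.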

 

Best Practices:

  • Makpar utilized industry best practices for a highly scalable, secure, modular, flexible, and cost-effective architecture, which enables the agency to quickly prototype concepts within a defined target architecture and develop iterative building blocks of capabilities.

  • Leverage cloud native services for rapid prototype development.

  • Use a microservices-based architecture to handle technology evolution now and in the future, enabling the agency to evolve the architecture with its business needs.

  • The Makpar team also documented all of the lessons learned, which will ultimately feed into a full-stack machine learning capability that can be used in any situation where analysis of relationships in data is required. The R&D effort is helping Makpar further advance its ability to help agencies leverage next-generation IT solutions for mission success.

 

Benefits to Federal Agency:

The main benefit is that the agency can now analyze a large amount of social media data in ways that can help with any kind of enforcement.

In the case of GameStop, this type of approach will help the agency to gain a comprehensive understanding of social media sentiment to predict when another stock “pump and dump” situation will occur. It will also enhance the speed of enforcement by identifying the “bad actors” that are taking part in these schemes.

 

Key Takeaways:

  1. Allowing agencies to leverage data and insights to prevent future stock “pump and dump” schemes.

  2. Bringing a DevOps capability to business intelligence allows for faster development flow to meet urgent agency needs.

  3. Analyzing social media relationships with graph databases to gain a clear perspective on online sentiment.