How startups are transforming cyber security with machine learning

Machine learning rewriting the rules of cyber security

The future of cyber security belongs to machine learning


Last week, DataVisor, a big data security startup based in Mountain View, California, raised $14.5m in Series A funding. In September alone, data security startups collectively raised more than $100m. Gartner recently estimated that worldwide spending on information security will surpass $100b by 2018. There is no doubt that data security matters enormously to businesses (we witness several high-profile, high-value breaches every year). What is more interesting is the novelty of the approaches and execution models that some of these startups are bringing forth, and that is perhaps one of the prime reasons venture capitalists are betting big on these dark horses. In today’s post, we analyze what exactly is fueling this wave of promise in the big data security startup segment.

Machine learning offers distinct advantages over conventional methods
Most traditional cyber security algorithms are reactive and rely on static knowledge, which leaves them unable to cope with modern multi-vector attacks. Machine learning (ML), by virtue of its predictive models, can identify patterns of suspicious behavior, detect risks, anticipate threats and limit the potential damage from ever-evolving cyber-attack strategies. That is the one-line summary of the utility of machine learning in cyber security. In practice, applying ML can potentially increase the detection rate of cyber security threats. Automated analysis of large-scale data can also lend a much-needed semblance of control over an ever-growing data pool and the security analysis it demands.
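To make the contrast between static knowledge and learned patterns concrete, here is a minimal Python sketch. It is only an illustration: the users, the download counts, the fixed limit of 30 and the z-score threshold of 3 are all made-up assumptions, and a simple per-user z-score stands in for the far richer models a real product would use.

```python
from statistics import mean, stdev

# Hypothetical daily file-download counts per user (illustrative data only).
history = {
    "alice": [3, 4, 2, 5, 3, 4, 3],
    "bob": [40, 38, 45, 42, 39, 41, 44],
}
today = {"alice": 25, "bob": 43}

STATIC_LIMIT = 30  # assumed fixed rule: flag anyone above 30 downloads a day


def static_rule(count, limit=STATIC_LIMIT):
    """Static knowledge: one threshold applied to every user."""
    return count > limit


def baseline_rule(count, past, z_threshold=3.0):
    """Learned baseline: flag a count far from the user's own history."""
    mu = mean(past)
    sigma = stdev(past)
    if sigma == 0:
        # No variation in the history: any change at all is unusual.
        return count != mu
    return abs(count - mu) / sigma > z_threshold


for user, count in today.items():
    print(user, "static:", static_rule(count),
          "baseline:", baseline_rule(count, history[user]))
```

With this toy data the static rule flags bob, whose 43 downloads are normal for him, and misses alice, whose 25 downloads are far above her usual three or four. The per-user baseline does the opposite.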

Behavioral predictive analytics is in focus
A number of these new age security startups focus on behavioral analytics, i.e. tracking the user access footprint and detecting any abnormal event or pattern of activity. So much so that a number of CXOs of such startups go around claiming that their systems could have detected the activities of Edward Snowden in time. Most of these detection algorithms are classification based, and their training sets are heavily skewed towards negative (benign) samples, so the accuracy of threat prediction depends a lot on the exact algorithm used and on choices such as feature extraction, scaling, normalization and sampling. Consequently, each startup claims that its algorithm is unique, best-in-class and returns the most accurate results. A noteworthy aspect here is that behavioral analytics depends on identity and access management (IAM) systems. But most IAM access decisions are determined by static rules, and if a data security threat is to be neutralized with minimum damage, these IAM systems also need to be integrated with the ML systems.
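The skew towards negative samples is also why a headline accuracy figure says little on its own. The Python sketch below uses made-up numbers (1,000 events, of which 10 are malicious) and two hypothetical detectors to show how accuracy, precision and recall can diverge. It is not drawn from any vendor's data.

```python
def metrics(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = threat)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is flagged or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Illustrative, heavily imbalanced data: 10 threats among 1,000 events.
y_true = [1] * 10 + [0] * 990

# Hypothetical detector A: never raises an alert.
pred_a = [0] * 1000

# Hypothetical detector B: catches 8 of the 10 threats, with 20 false alarms.
pred_b = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

print("A:", metrics(y_true, pred_a))
print("B:", metrics(y_true, pred_b))
```

Detector A scores 99% accuracy while catching nothing. Detector B scores slightly lower on accuracy (97.8%) but finds 80% of the threats. That is why claims of "highest accuracy" are hard to compare without knowing the class balance and the metric behind them.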

Automation will also address increasing demand for security analysts
Rising demand for highly skilled security staff might help the transition to ML systems. A common question right now is how far enterprises would be willing to trust the results of machine learning based algorithms. Currently, most ML systems output ‘warnings’ and leave the decision-making in human hands. However, the demands of round-the-clock security, the scarcity of expert security analysts and improvements in ML techniques will gradually strengthen the case for enterprise data security being taken over entirely by ML systems, eliminating human intervention.

The process, though, has its fair share of challenges to negotiate
Applying ML to cyber security is not without its challenges. A prime one is the lack of enough real attack examples and verified normal behavioral patterns in any given setup. Data privacy issues will always remain a bone of contention. And an overwhelming increase in the number of alerts labelled as potential threats, which forces the ML process to scale constantly, does not help engineers develop a robust system either.

Startups are leading the path to innovation in the segment
Some of the startups that are applying ML to cyber security solutions include Caspida, CyActive, Praesidio, SparkCognition, Darktrace, Cybereason, SecBI, Exabeam, Fortscale, Vectra Networks, Prelert, Palerra, Niara, Dtex Systems and Brighterion. Some of these have a narrow business focus. For example, Caspida (acquired by Splunk), Fortscale, Niara and Dtex focus on threats emanating from within corporate networks; CyActive (acquired by PayPal) employs bio-inspired algorithms to predict malware evolution; Praesidio focuses exclusively on the banking segment; and SparkCognition focuses on the equipment and assets of energy companies. Even traditional security software companies such as Trend Micro, Symantec and Intel Security (which acquired McAfee) are now building ML offerings.

Mobile-only security segment is also witnessing startup activity
ML-backed cyber security solutions for mobile are gaining momentum too. For example, San Francisco based Lookout is an exclusively mobile-focused cyber security startup. It employs machine learning and contextual analytics to protect mobile devices from malicious applications. Its software analyzes an app’s executable code and IP address, among other parameters, to develop a contextual understanding of its behavior. Each app’s behavior is then probed for suspicious activity by comparing it with millions of data points Lookout has assembled from apps running on 100 million mobile devices (carriers pre-install its app, Lookout Mobile Security). So far, it has raised $282m of funding in eight rounds.

And then, technology evolution will create newer challenges to overpower
Finally, threats too will keep on evolving. Vicarious, a California-based machine learning startup, claims to have developed software that can successfully interpret and reproduce the text inside a CAPTCHA image with 90% accuracy. Its algorithm is based on its proprietary visual perception system, called Recursive Cortical Network (RCN) technology, which is intended to mimic the human brain’s neocortex. The good news is that Vicarious does not plan to use this tool for any cyber breach. The company sees potential applications in the fields of medical analysis, image search and robotics. But the truth is that if Vicarious can do it, then others (potential hackers) can develop similar systems too. It is one of those things you don’t know whether to cheer about or not!

Anubhav, a data scientist, writes about new developments and future trends in the machine learning and data analytics domain.
