AntiNex 1.0.0
Date: 2018-03-30
I am looking forward to seeing what we can teach Deep Neural Networks about defending software from network exploits. I recently got pretty hooked on debugging malware, and that led me to building datasets for teaching DNNs to predict an attack vs a non-attack record within the OSI network layers.
The first results on unscaled datasets were not great, with predictive accuracy varying widely between 66% and 89%.
Well, as I kept trying different DNN configurations and datasets (and wondering why my xgboost scores were always better), I found out I needed to bound my data in a known range using an sklearn.preprocessing.MinMaxScaler. Wow… I was pretty shocked at how fast the accuracies shot up from there. With just a wide, two-layer DNN, the same Keras and TensorFlow models are now repeatedly predicting attack vs non-attack network traffic with over 99.7% accuracy. In fact, every AntiNex dataset (Django, Flask, React + Redux, Vue, and Spring) can be used to train a DNN within AntiNex to predict an attack with over 99.7% accuracy.
I decided this was good enough to build a data pipeline around, and made the pipeline auto-scale the data before training and before making live predictions. By default, AntiNex now automatically normalizes datasets with this scaler so that all values fall within the range [-1.0, 1.0]. This appears to work not only on classification problems, but also for predicting stock prices using regression.
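Here is a minimal sketch of that scaling step, in case it helps to see it outside of AntiNex. It is not the AntiNex pipeline code: the `fit_scaler` helper, the column meanings, and the sample numbers are all made up for illustration. The only real API used is scikit-learn's `MinMaxScaler` with `feature_range=(-1.0, 1.0)`. The scaler is fit once on training rows and then reused on a new "live" row, so both are mapped with the same per-column minimum and maximum.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def fit_scaler(train_rows):
    """Hypothetical helper: fit a scaler that bounds each column to [-1.0, 1.0]."""
    scaler = MinMaxScaler(feature_range=(-1.0, 1.0))
    scaler.fit(train_rows)
    return scaler


# Made-up training rows with two columns on very different scales
train = np.array([
    [0.0, 10.0],
    [5.0, 1000.0],
    [10.0, 500000.0],
])

scaler = fit_scaler(train)

# Training data after scaling: every column now spans exactly [-1.0, 1.0]
scaled_train = scaler.transform(train)

# A new record is scaled with the SAME fitted scaler before prediction,
# so it lands in the range the model was trained on
live = np.array([[2.5, 250000.0]])
scaled_live = scaler.transform(live)

print(scaled_train)
print(scaled_live)
```

Note that a live value outside the training minimum or maximum would map outside [-1.0, 1.0], so the training data should cover the ranges expected in production.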
I have put the technical docs together on how AntiNex works:
http://antinex.readthedocs.io/en/latest/
Here are two Jupyter notebooks for reviewing the process, analysis and validation of the scores:
Standalone Jupyter Notebook without any AntiNex components
Using Pre-trained Deep Neural Networks in a Jupyter Notebook
I have built the AntiNex v1 stack to run using docker-compose, but next I will be exploring Kubernetes for hosting the containerized, distributed artificial intelligence stack.
For those curious about how I found out that I needed scalers: I tried including volume in my regression stock price predictions while I was benchmarking scipype results against these new, terrible DNN results. I kept removing column after column until I noticed that having volume in the dataset was what made my price predictions terrible… and the rest is what you see in AntiNex.
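One plausible reading of why volume caused so much trouble is sheer magnitude: share volume is usually in the millions while prices move by a few dollars. The sketch below uses made-up rows (the prices and volumes are illustrative, not from my datasets) to compare how wide each column is before and after scaling.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up rows for illustration only: [close_price, volume]
rows = np.array([
    [101.2, 1_200_000.0],
    [102.5, 3_400_000.0],
    [99.8, 900_000.0],
    [103.1, 2_100_000.0],
])

# Width of each column (max - min) in raw units
raw_spread = rows.max(axis=0) - rows.min(axis=0)

# Same columns after bounding each one to [-1.0, 1.0]
scaled = MinMaxScaler(feature_range=(-1.0, 1.0)).fit_transform(rows)
scaled_spread = scaled.max(axis=0) - scaled.min(axis=0)

print("raw spread per column:   ", raw_spread)
print("scaled spread per column:", scaled_spread)
```

In the raw data the volume column is several hundred thousand times wider than the price column. After scaling, both columns have the same width, so neither one swamps the other going into the network.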
Thanks for reading!
Until next time,
Jay