Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.
The biggest companies are trying to find the intersection between what people want and what the company wants people to buy.
In this context, Andre Fatala and Renato Pedigoni present a case study from Magazine Luiza.
For persistent graph storage, they weighed two options, Neo4j and Titan, and based on the CAP theorem they decided to work with Titan.
To interact with the graph, they chose Gremlin, a DSL that is part of the TinkerPop stack and runs on the JVM.
To work with nodes, vertices are created with Gremlin, which looks something like: vert1 = g.addVertex(); vert1.type = 'visitante'. Gremlin also lets you configure the direction in which each edge is linked, which matters for recommendations of the type "who saw, also saw".
We would have an initial node representing the product the consumer visited, with edges pointing to each session in which the customer viewed it and to the other products they saw. After that, Gremlin counts how many times this flow was executed.
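The traversal described above can be sketched in plain Python, using simple dictionaries as a toy in-memory stand-in for the graph (this is not the Gremlin API; all product and session names are illustrative):

```python
from collections import defaultdict

# Toy stand-in for the graph described above:
# each product maps to the sessions it was viewed in,
# and each session maps to the products viewed during it.
product_sessions = {
    "tv": ["s1", "s2", "s3"],
    "soundbar": ["s1", "s2"],
    "fridge": ["s3"],
}
session_products = {
    "s1": ["tv", "soundbar"],
    "s2": ["tv", "soundbar"],
    "s3": ["tv", "fridge"],
}

# Walk product -> session -> other product and count how often each
# path occurs; that count quantifies the "who saw, also saw" flow.
def also_saw(product):
    counts = defaultdict(int)
    for session in product_sessions[product]:
        for other in session_products[session]:
            if other != product:
                counts[other] += 1
    return dict(counts)

print(also_saw("tv"))  # -> {'soundbar': 2, 'fridge': 1}
```

The resulting counts are exactly the quantities that back the edges in the next step.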
A final edge can then be drawn representing the "who saw, also saw" relationship.

Faunus is a Hadoop-based graph analytics engine for analyzing graphs distributed across a multi-machine compute cluster.
It supports the Gremlin traversal language and operates over the distributed graph database Titan, or over HDFS in various text and binary formats.
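Faunus executes Gremlin traversals as Hadoop MapReduce jobs. A minimal single-process sketch of that map/shuffle/reduce pattern applied to co-view counting, in Python (a conceptual illustration, not the Faunus API; session data is made up):

```python
from collections import defaultdict
from itertools import combinations

# Illustrative input: the products viewed in each session.
sessions = [
    ["tv", "soundbar"],
    ["tv", "soundbar"],
    ["fridge", "tv"],
]

# Map phase: emit (product_pair, 1) for every pair co-viewed in a session.
def map_phase(session):
    for pair in combinations(sorted(set(session)), 2):
        yield pair, 1

# Shuffle phase: group emitted values by key.
groups = defaultdict(list)
for session in sessions:
    for key, value in map_phase(session):
        groups[key].append(value)

# Reduce phase: sum the counts per pair -- the weight of each
# "who saw, also saw" edge, computed in bulk across all sessions.
edge_weights = {pair: sum(values) for pair, values in groups.items()}
print(edge_weights[("soundbar", "tv")])  # -> 2
```

In a real cluster the map and reduce phases run in parallel across machines, which is what makes this approach viable at the scale of an entire retail catalog.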
The company has used Faunus, Rexster, Cassandra, and the other technologies mentioned to expose all of this data to the business team that uses it.