Graph-Powered Twitter Trend Analysis: Building Recommendations with Neo4j and Spring Data
Explore how to use the Neo4j graph database for real-time Twitter trend analysis and user recommendations with Spring Data Neo4j and Spring Social.
Introduction
Traditional relational databases excel at storing structured data, but they struggle when relationships become the primary focus of queries. Social networks, recommendation engines, and trend analysis are domains where graph databases shine, offering intuitive modeling and exceptional query performance for connected data.
This article walks through a practical implementation that combines Neo4j, a leading graph database, with Spring Data Neo4j and Spring Social to analyze live Twitter streams, detect trends, and generate user recommendations.
Key Insight: Graph databases eliminate the need for complex JOIN operations by making relationships first-class citizens, enabling queries that would be impractical in relational systems.
Why Graph Databases for Social Analysis?
The Problem with Relational Models
Consider modeling Twitter data in a relational database:
-- Users table
CREATE TABLE users (id BIGINT PRIMARY KEY, username VARCHAR(255));
-- Tweets table
CREATE TABLE tweets (id BIGINT PRIMARY KEY, user_id BIGINT, content TEXT);
-- Follows relationship
CREATE TABLE follows (follower_id BIGINT, following_id BIGINT);
-- Hashtags
CREATE TABLE hashtags (id BIGINT PRIMARY KEY, tag VARCHAR(255));
-- Tweet-Hashtag relationship
CREATE TABLE tweet_hashtags (tweet_id BIGINT, hashtag_id BIGINT);
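A seemingly simple question such as "find users who follow someone who tweeted about a trending topic" already requires joining at least four of these tables (follows, tweets, tweet_hashtags, hashtags). Every additional hop in the relationship chain adds another JOIN, so both the SQL text and its execution cost grow quickly as traversal depth increases.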
The Graph Advantage
In Neo4j, the same data is modeled naturally:
┌─────────────────────────────────────────────────────────────────┐
│ Twitter Graph Model │
└─────────────────────────────────────────────────────────────────┘
(User:alice)─[:FOLLOWS]─>(User:bob)
│ │
[:POSTED] [:POSTED]
│ │
▼ ▼
(Tweet:t1) (Tweet:t2)
│ │
[:TAGGED] [:TAGGED]
│ │
▼ ▼
(Tag:#java) (Tag:#spring)
| Aspect | Relational | Graph |
|---|---|---|
| Relationship Queries | Complex JOINs | Native traversal |
| Performance | Degrades as JOIN depth grows | Traversal cost depends on the subgraph visited, not on total data size |
| Schema Evolution | Rigid migrations | Dynamic properties |
| Intuitive Modeling | Normalization required | Matches mental model |
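To make the "native traversal" row concrete, here is a minimal in-memory sketch in plain Java. It is not how Neo4j stores data; the class `TraversalSketch`, its maps, and its method names are hypothetical stand-ins for the FOLLOWS, POSTED, and TAGGED relationships in the diagram. It only illustrates that the earlier question can be answered by walking from node to node instead of joining tables.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only: plain maps stand in for FOLLOWS, POSTED and TAGGED relationships.
public class TraversalSketch {
    private final Map<String, Set<String>> follows = new HashMap<>(); // user -> users they follow
    private final Map<String, Set<String>> posted = new HashMap<>();  // user -> tweet ids
    private final Map<String, Set<String>> tagged = new HashMap<>();  // tweet id -> tag names

    public void follow(String follower, String followed) { add(follows, follower, followed); }
    public void post(String user, String tweetId) { add(posted, user, tweetId); }
    public void tag(String tweetId, String tagName) { add(tagged, tweetId, tagName); }

    private static void add(Map<String, Set<String>> rel, String from, String to) {
        rel.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    // Users who follow at least one author of a tweet carrying the given tag.
    public Set<String> followersOfAuthorsTweetingAbout(String tagName) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : follows.entrySet()) {
            for (String followed : entry.getValue()) {
                if (hasTweetedAbout(followed, tagName)) {
                    result.add(entry.getKey());
                    break; // one matching path is enough for this follower
                }
            }
        }
        return result;
    }

    // Walks user -> tweets -> tags, the same hops as (User)-[:POSTED]->(Tweet)-[:TAGGED]->(Tag).
    private boolean hasTweetedAbout(String user, String tagName) {
        for (String tweetId : posted.getOrDefault(user, Collections.emptySet())) {
            if (tagged.getOrDefault(tweetId, Collections.emptySet()).contains(tagName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same sample data as the diagram: alice follows bob, each posted one tagged tweet.
        TraversalSketch graph = new TraversalSketch();
        graph.follow("alice", "bob");
        graph.post("alice", "t1");
        graph.post("bob", "t2");
        graph.tag("t1", "java");
        graph.tag("t2", "spring");
        System.out.println(graph.followersOfAuthorsTweetingAbout("spring"));
    }
}
```

With the diagram's sample data, only alice qualifies for the tag "spring", because she follows bob and bob posted the tweet tagged with it.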
Architecture Overview
┌─────────────────────────────────────────────────────────────────┐
│ Twitter Stream API │
└─────────────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Spring Social Twitter │
│ (OAuth + Stream Processing) │
└─────────────────────────────┬───────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Spring Boot Application │
├─────────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐ ┌─────────────────┐ ┌────────────────┐ │
│ │ User Service │ │ Tweet Service │ │ Tag Service │ │
│ └────────┬────────┘ └────────┬────────┘ └───────┬────────┘ │
│ │ │ │ │
│ └────────────────────┼───────────────────┘ │
│ │ │
│ Spring Data Neo4j │
└────────────────────────────────┼────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Neo4j Graph Database │
│ (Dockerized, Bolt Protocol) │
└─────────────────────────────────────────────────────────────────┘
Technology Stack
| Component | Technology | Purpose |
|---|---|---|
| Database | Neo4j | Graph storage and Cypher queries |
| Framework | Spring Boot | Application infrastructure |
| Data Access | Spring Data Neo4j | OGM (Object-Graph Mapping) |
| Social Integration | Spring Social | Twitter API connectivity |
| Build Tool | Maven | Dependency management |
Implementation Deep Dive
Domain Model with Neo4j Annotations
The graph model is expressed using Neo4j OGM annotations:
// User Node
@NodeEntity
public class User {
@Id
@GeneratedValue
private Long id;
@Property("twitterId")
private String twitterId;
@Property("username")
private String username;
@Property("displayName")
private String displayName;
@Property("followersCount")
private int followersCount;
@Relationship(type = "POSTED", direction = Relationship.OUTGOING)
private Set<Tweet> tweets = new HashSet<>();
@Relationship(type = "FOLLOWS", direction = Relationship.OUTGOING)
private Set<User> following = new HashSet<>();
@Relationship(type = "FOLLOWS", direction = Relationship.INCOMING)
private Set<User> followers = new HashSet<>();
public void post(Tweet tweet) {
tweets.add(tweet);
tweet.setAuthor(this);
}
public void follow(User user) {
following.add(user);
}
// Constructors, getters, setters...
}
// Tweet Node
@NodeEntity
public class Tweet {
@Id
@GeneratedValue
private Long id;
@Property("tweetId")
private String tweetId;
@Property("content")
private String content;
@Property("createdAt")
private Date createdAt;
@Property("retweetCount")
private int retweetCount;
@Relationship(type = "POSTED", direction = Relationship.INCOMING)
private User author;
@Relationship(type = "TAGGED", direction = Relationship.OUTGOING)
private Set<Tag> tags = new HashSet<>();
@Relationship(type = "MENTIONS", direction = Relationship.OUTGOING)
private Set<User> mentions = new HashSet<>();
public void addTag(Tag tag) {
tags.add(tag);
tag.getTweets().add(this);
}
// Constructors, getters, setters...
}
// Tag (Hashtag) Node
@NodeEntity
public class Tag {
@Id
@GeneratedValue
private Long id;
@Property("name")
@Index(unique = true)
private String name;
@Property("tweetCount")
private int tweetCount;
@Relationship(type = "TAGGED", direction = Relationship.INCOMING)
private Set<Tweet> tweets = new HashSet<>();
// Constructors, getters, setters...
}
Repository Layer with Custom Cypher Queries
Spring Data Neo4j provides powerful repository support with custom Cypher queries:
public interface UserRepository extends Neo4jRepository<User, Long> {
Optional<User> findByTwitterId(String twitterId);
Optional<User> findByUsername(String username);
// Find users who follow a specific user
@Query("MATCH (u:User)-[:FOLLOWS]->(target:User {username: $username}) " +
"RETURN u")
List<User> findFollowersOf(@Param("username") String username);
// Recommendation: Users followed by people I follow (2nd degree)
// The count is only needed for ranking, so it is aggregated in WITH and
// only the User node is returned, matching the List<User> return type.
@Query("MATCH (me:User {username: $username})-[:FOLLOWS]->(friend)-[:FOLLOWS]->(recommended) " +
"WHERE NOT (me)-[:FOLLOWS]->(recommended) AND me <> recommended " +
"WITH recommended, count(friend) AS mutualFriends " +
"ORDER BY mutualFriends DESC " +
"LIMIT $limit " +
"RETURN recommended")
List<User> findRecommendedUsers(
@Param("username") String username,
@Param("limit") int limit);
// Find users who tweeted about a specific tag
@Query("MATCH (u:User)-[:POSTED]->(t:Tweet)-[:TAGGED]->(tag:Tag {name: $tagName}) " +
"RETURN DISTINCT u " +
"LIMIT $limit")
List<User> findUsersByTag(
@Param("tagName") String tagName,
@Param("limit") int limit);
}
public interface TweetRepository extends Neo4jRepository<Tweet, Long> {
@Query("MATCH (t:Tweet)-[:TAGGED]->(tag:Tag {name: $tagName}) " +
"RETURN t " +
"ORDER BY t.createdAt DESC " +
"LIMIT $limit")
List<Tweet> findByTag(
@Param("tagName") String tagName,
@Param("limit") int limit);
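// Find trending tweets (most retweeted in last 24h)
// Note: the comparison with datetime() assumes createdAt is stored as a native
// temporal value. Neo4j OGM's default java.util.Date conversion writes an
// ISO 8601 string, which would not compare against datetime() as intended.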
@Query("MATCH (t:Tweet) " +
"WHERE t.createdAt > datetime() - duration('P1D') " +
"RETURN t " +
"ORDER BY t.retweetCount DESC " +
"LIMIT $limit")
List<Tweet> findTrendingTweets(@Param("limit") int limit);
}
public interface TagRepository extends Neo4jRepository<Tag, Long> {
Optional<Tag> findByName(String name);
// Find trending hashtags
@Query("MATCH (tag:Tag)<-[:TAGGED]-(t:Tweet) " +
"WHERE t.createdAt > datetime() - duration('P1D') " +
"RETURN tag, count(t) as tweetCount " +
"ORDER BY tweetCount DESC " +
"LIMIT $limit")
List<Map<String, Object>> findTrendingTags(@Param("limit") int limit);
// Find related tags (co-occurrence)
@Query("MATCH (tag:Tag {name: $tagName})<-[:TAGGED]-(t:Tweet)-[:TAGGED]->(related:Tag) " +
"WHERE tag <> related " +
"RETURN related, count(t) as coOccurrences " +
"ORDER BY coOccurrences DESC " +
"LIMIT $limit")
List<Map<String, Object>> findRelatedTags(
@Param("tagName") String tagName,
@Param("limit") int limit);
}
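The recommendation and related-tag queries above are both "count the paths, then rank" patterns. The following plain-Java sketch mirrors their logic on in-memory maps so the semantics are easier to follow. It is an illustration under assumptions, not the project's code: the class `GraphQuerySketch`, its method names, and the alphabetical tie-break are hypothetical (the Cypher queries leave the order of ties unspecified).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GraphQuerySketch {

    // Mirrors findRecommendedUsers: me -> friend -> candidate, excluding me and users already followed.
    public static List<String> recommend(Map<String, Set<String>> follows, String me, int limit) {
        Set<String> friends = follows.getOrDefault(me, Collections.emptySet());
        Map<String, Integer> mutualFriends = new HashMap<>();
        for (String friend : friends) {
            for (String candidate : follows.getOrDefault(friend, Collections.emptySet())) {
                if (!candidate.equals(me) && !friends.contains(candidate)) {
                    mutualFriends.merge(candidate, 1, Integer::sum); // one path per friend
                }
            }
        }
        return topKeys(mutualFriends, limit);
    }

    // Mirrors findRelatedTags: count tweets that carry both the given tag and another tag.
    public static List<String> relatedTags(Map<String, Set<String>> tweetTags, String tagName, int limit) {
        Map<String, Integer> coOccurrences = new HashMap<>();
        for (Set<String> tags : tweetTags.values()) {
            if (!tags.contains(tagName)) {
                continue;
            }
            for (String other : tags) {
                if (!other.equals(tagName)) {
                    coOccurrences.merge(other, 1, Integer::sum);
                }
            }
        }
        return topKeys(coOccurrences, limit);
    }

    // ORDER BY count DESC LIMIT n; ties are broken alphabetically (an assumption for determinism).
    private static List<String> topKeys(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : ranked) {
            if (keys.size() >= limit) {
                break;
            }
            keys.add(entry.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> follows = new HashMap<>();
        follows.put("alice", new HashSet<>(Arrays.asList("bob", "carol")));
        follows.put("bob", new HashSet<>(Arrays.asList("dave", "erin", "alice")));
        follows.put("carol", new HashSet<>(Arrays.asList("dave", "bob")));
        // dave is followed by two of alice's friends, erin by one.
        System.out.println(recommend(follows, "alice", 10));
    }
}
```

In the sample data, dave ranks ahead of erin because two of alice's friends follow him, which is the same signal the `mutualFriends` count carries in the Cypher query.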
Twitter Stream Integration
The application connects to Twitter's streaming API to capture live tweets:
@Service
public class TwitterStreamService {
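// SLF4J logger (org.slf4j.Logger, org.slf4j.LoggerFactory) used by the listener callbacks below
private static final Logger log = LoggerFactory.getLogger(TwitterStreamService.class);
private final Twitter twitter;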
private final TweetProcessor tweetProcessor;
public TwitterStreamService(Twitter twitter, TweetProcessor tweetProcessor) {
this.twitter = twitter;
this.tweetProcessor = tweetProcessor;
}
public void startStreamingByKeywords(List<String> keywords) {
StreamListener listener = new StreamListener() {
@Override
public void onTweet(Tweet tweet) {
tweetProcessor.process(tweet);
}
@Override
public void onDelete(StreamDeleteEvent deleteEvent) {
// Handle deletions if needed
}
@Override
public void onLimit(int numberOfLimitedTweets) {
log.warn("Rate limited: {} tweets", numberOfLimitedTweets);
}
@Override
public void onWarning(StreamWarningEvent warningEvent) {
log.warn("Stream warning: {}", warningEvent.getMessage());
}
};
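// track(...) takes a single String; multiple keywords are comma-separated
FilterStreamParameters params = new FilterStreamParameters()
.track(String.join(",", keywords));
// filter(...) expects a List<StreamListener>, so wrap the single listener
twitter.streamingOperations().filter(params, Collections.singletonList(listener));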
}
}
@Component
public class TweetProcessor {
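private final UserRepository userRepository;
private final TweetRepository tweetRepository;
private final TagRepository tagRepository;
// Constructor injection: the final fields must be assigned for the class to compile
public TweetProcessor(UserRepository userRepository,
TweetRepository tweetRepository,
TagRepository tagRepository) {
this.userRepository = userRepository;
this.tweetRepository = tweetRepository;
this.tagRepository = tagRepository;
}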
@Transactional
public void process(org.springframework.social.twitter.api.Tweet tweet) {
// Find or create user
User user = userRepository.findByTwitterId(
String.valueOf(tweet.getFromUserId()))
.orElseGet(() -> createUser(tweet));
// Create tweet node
Tweet tweetNode = new Tweet();
tweetNode.setTweetId(String.valueOf(tweet.getId()));
tweetNode.setContent(tweet.getText());
tweetNode.setCreatedAt(tweet.getCreatedAt());
tweetNode.setRetweetCount(tweet.getRetweetCount());
// Extract and link hashtags
extractHashtags(tweet.getText()).forEach(tagName -> {
Tag tag = tagRepository.findByName(tagName)
.orElseGet(() -> {
Tag newTag = new Tag();
newTag.setName(tagName);
return tagRepository.save(newTag);
});
tweetNode.addTag(tag);
tag.setTweetCount(tag.getTweetCount() + 1);
});
// Link to user
user.post(tweetNode);
userRepository.save(user);
}
private Set<String> extractHashtags(String text) {
Set<String> hashtags = new HashSet<>();
Matcher matcher = Pattern.compile("#(\\w+)").matcher(text);
while (matcher.find()) {
hashtags.add(matcher.group(1).toLowerCase());
}
return hashtags;
}
private User createUser(org.springframework.social.twitter.api.Tweet tweet) {
User user = new User();
user.setTwitterId(String.valueOf(tweet.getFromUserId()));
user.setUsername(tweet.getFromUser());
user.setDisplayName(tweet.getUser().getName());
user.setFollowersCount(tweet.getUser().getFollowersCount());
return user;
}
}
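The hashtag extraction step can be tried in isolation. The sketch below mirrors `extractHashtags` from `TweetProcessor` with two deliberate changes that are assumptions, not part of the original code: the pattern is compiled once with `Pattern.UNICODE_CHARACTER_CLASS` so that `\w` also matches non-ASCII letters, and lowercasing uses `Locale.ROOT` so the result does not depend on the JVM's default locale. The class name `HashtagSketch` is hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HashtagSketch {
    // Compiled once; UNICODE_CHARACTER_CLASS makes \w match letters beyond ASCII.
    private static final Pattern HASHTAG =
            Pattern.compile("#(\\w+)", Pattern.UNICODE_CHARACTER_CLASS);

    // Returns distinct, lowercased hashtag names (without '#') in order of first appearance.
    public static Set<String> extractHashtags(String text) {
        Set<String> hashtags = new LinkedHashSet<>();
        if (text == null) {
            return hashtags;
        }
        Matcher matcher = HASHTAG.matcher(text);
        while (matcher.find()) {
            hashtags.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return hashtags;
    }

    public static void main(String[] args) {
        // "#Java" and "#java" collapse into one entry after lowercasing.
        System.out.println(extractHashtags("Loving #Java and #Spring, #java rocks"));
    }
}
```

Normalizing case before the lookup matters here because `Tag.name` is declared unique: without it, `#Java` and `#java` would become two separate Tag nodes and split the trend counts.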
REST API Endpoints
The application exposes REST endpoints for querying the graph:
@RestController
@RequestMapping("/api")
public class TrendController {
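private final UserRepository userRepository;
private final TweetRepository tweetRepository;
private final TagRepository tagRepository;
// Constructor injection: the final fields must be assigned for the class to compile
public TrendController(UserRepository userRepository,
TweetRepository tweetRepository,
TagRepository tagRepository) {
this.userRepository = userRepository;
this.tweetRepository = tweetRepository;
this.tagRepository = tagRepository;
}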
// GET /api/users - List all users
@GetMapping("/users")
// Iterable rather than List: the OGM-based Neo4jRepository declares findAll() as Iterable<T>
public Iterable<User> getUsers() {
return userRepository.findAll();
}
// GET /api/users/search/{username} - Search user
@GetMapping("/users/search/{username}")
public ResponseEntity<User> searchUser(@PathVariable String username) {
return userRepository.findByUsername(username)
.map(ResponseEntity::ok)
.orElse(ResponseEntity.notFound().build());
}
// GET /api/users/{username}/recommendations - Get recommendations
@GetMapping("/users/{username}/recommendations")
public List<User> getRecommendations(
@PathVariable String username,
@RequestParam(defaultValue = "10") int limit) {
return userRepository.findRecommendedUsers(username, limit);
}
// GET /api/tweets - List tweets
@GetMapping("/tweets")
// Iterable rather than List: the OGM-based Neo4jRepository declares findAll() as Iterable<T>
public Iterable<Tweet> getTweets() {
return tweetRepository.findAll();
}
// GET /api/tweets/trend - Get trending tweets
@GetMapping("/tweets/trend")
public List<Tweet> getTrendingTweets(
@RequestParam(defaultValue = "20") int limit) {
return tweetRepository.findTrendingTweets(limit);
}
// GET /api/tags - List hashtags
@GetMapping("/tags")
// Iterable rather than List: the OGM-based Neo4jRepository declares findAll() as Iterable<T>
public Iterable<Tag> getTags() {
return tagRepository.findAll();
}
// GET /api/tags/trending - Get trending hashtags
@GetMapping("/tags/trending")
public List<Map<String, Object>> getTrendingTags(
@RequestParam(defaultValue = "10") int limit) {
return tagRepository.findTrendingTags(limit);
}
// GET /api/tags/{tagName}/related - Get related hashtags
@GetMapping("/tags/{tagName}/related")
public List<Map<String, Object>> getRelatedTags(
@PathVariable String tagName,
@RequestParam(defaultValue = "10") int limit) {
return tagRepository.findRelatedTags(tagName, limit);
}
}
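A client can call these endpoints with the JDK's built-in HTTP client (Java 11 or later). The sketch below is a hypothetical helper, not part of the project: the class `TrendApiClientSketch` and the base URL `http://localhost:8080` (Spring Boot's default port) are assumptions, and only the paths and the `limit` parameter come from the controller above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class TrendApiClientSketch {
    private final String baseUrl; // e.g. "http://localhost:8080" (assumed, not stated in the article)
    private final HttpClient http = HttpClient.newHttpClient();

    public TrendApiClientSketch(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    public URI trendingTagsUri(int limit) {
        return URI.create(baseUrl + "/api/tags/trending?limit=" + limit);
    }

    public URI relatedTagsUri(String tagName, int limit) {
        return URI.create(baseUrl + "/api/tags/" + encodeSegment(tagName) + "/related?limit=" + limit);
    }

    public URI recommendationsUri(String username, int limit) {
        return URI.create(baseUrl + "/api/users/" + encodeSegment(username) + "/recommendations?limit=" + limit);
    }

    // URLEncoder targets form encoding, so '+' is rewritten to %20 for use in a path segment.
    private static String encodeSegment(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Performs a blocking GET and returns the raw JSON body; requires the application to be running.
    public String get(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        TrendApiClientSketch client = new TrendApiClientSketch("http://localhost:8080");
        // Only builds the URI; pass it to client.get(...) once the application is up.
        System.out.println(client.relatedTagsUri("java", 5));
    }
}
```

Keeping URI construction separate from the network call makes the path and query handling easy to check without a running server.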
Configuration
Application Properties
# Neo4j Connection
spring.data.neo4j.uri=bolt://localhost:7687
spring.data.neo4j.username=neo4j
spring.data.neo4j.password=secret
# Twitter API Credentials
spring.social.twitter.appId=YOUR_APP_ID
spring.social.twitter.appSecret=YOUR_APP_SECRET
spring.social.twitter.accessToken=YOUR_ACCESS_TOKEN
spring.social.twitter.accessTokenSecret=YOUR_ACCESS_TOKEN_SECRET
# Logging
logging.level.org.neo4j=INFO
logging.level.org.springframework.data.neo4j=DEBUG
Docker Setup for Neo4j
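# Run Neo4j with Docker
# Note: Neo4j 5.x images reject initial passwords shorter than 8 characters.
# If the container refuses to start, pin an older image tag or choose a longer
# password and update application.properties to match.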
docker run \
--name neo4j-twitter \
-p 7474:7474 \
-p 7687:7687 \
-e NEO4J_AUTH=neo4j/secret \
-v $HOME/neo4j/data:/data \
-d neo4j:latest
Running the Application
# Clone the repository
git clone https://github.com/mgorav/neo4j-twitter-trend-recomendentation.git
cd neo4j-twitter-trend-recomendentation
# Start Neo4j (skip this step if the container from the Docker setup above is already running)
docker run -d --name neo4j -p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/secret neo4j:latest
# Configure Twitter credentials in application.properties
# Build and run
mvn clean install
mvn spring-boot:run
API Endpoints Summary
| Endpoint | Method | Description |
|---|---|---|
| /api/users | GET | List all users |
| /api/users/search/{username} | GET | Search user by username |
| /api/users/{username}/recommendations | GET | Get user recommendations |
| /api/tweets | GET | List all tweets |
| /api/tweets/trend | GET | Get trending tweets |
| /api/tags | GET | List all hashtags |
| /api/tags/trending | GET | Get trending hashtags |
| /api/tags/{tagName}/related | GET | Get related hashtags |
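The trending endpoints are backed by repository queries that only consider tweets created in the last 24 hours and then rank by count. A minimal in-memory sketch of that windowed count is shown below. The class `TrendingWindowSketch`, the nested `TaggedTweet` type, and the alphabetical tie-break are hypothetical; passing `now` as a parameter is a design choice made here so the window can be tested with fixed timestamps.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TrendingWindowSketch {

    // Hypothetical value type: just the two fields the trending query looks at.
    public static final class TaggedTweet {
        final Instant createdAt;
        final Set<String> tags;

        public TaggedTweet(Instant createdAt, String... tags) {
            this.createdAt = createdAt;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    // Counts tags on tweets newer than now minus 24 hours, then returns the top names.
    public static List<String> trendingTags(List<TaggedTweet> tweets, Instant now, int limit) {
        Instant cutoff = now.minus(Duration.ofDays(1));
        Map<String, Integer> counts = new HashMap<>();
        for (TaggedTweet tweet : tweets) {
            if (tweet.createdAt.isAfter(cutoff)) {
                for (String tag : tweet.tags) {
                    counts.merge(tag, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(counts.entrySet());
        // Highest count first; ties broken alphabetically (an assumption for determinism).
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : ranked) {
            if (names.size() >= limit) {
                break;
            }
            names.add(entry.getKey());
        }
        return names;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-02T12:00:00Z");
        List<TaggedTweet> tweets = Arrays.asList(
                new TaggedTweet(Instant.parse("2024-01-02T10:00:00Z"), "java", "spring"),
                new TaggedTweet(Instant.parse("2024-01-02T11:00:00Z"), "java"),
                new TaggedTweet(Instant.parse("2024-01-01T11:00:00Z"), "neo4j")); // outside the window
        System.out.println(trendingTags(tweets, now, 10));
    }
}
```

In the sample data the tweet tagged "neo4j" is 25 hours old, so it falls outside the window and does not contribute to the ranking.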
Use Cases and Applications
Graph-based analysis of connected data is particularly valuable in the following industries:
| Industry | Use Case |
|---|---|
| Financial Services | Fraud detection through relationship patterns |
| Marketing | Influencer identification and campaign targeting |
| IoT | Device relationship and network topology analysis |
| Security | Threat actor network mapping |
| E-commerce | Recommendation engines |
Conclusion
Combining Neo4j with Spring Data Neo4j provides an elegant solution for social data analysis. The graph model naturally represents relationships, while Cypher queries enable complex traversals with simple, readable syntax. Key takeaways:
- Graph databases excel when relationships are the primary focus
- Spring Data Neo4j provides familiar repository patterns for graph access
- Cypher queries enable powerful graph traversals
- Real-time streaming with Spring Social enables live data capture
- Recommendation algorithms become natural graph queries
The neo4j-twitter-trend-recomendentation project demonstrates these concepts in a production-ready implementation that can be extended for various social analysis use cases.