Open the file IncidentClusterRepository.java
from the directory src/main/java/com/sap/hana/hibernate/sample/repositories
.
The findClusters
method needs to be implemented to perform the clustering per category in the database.
As there is no support for spatial clustering in Hibernate, this requires a native query. However, Hibernate can still automatically map the query results to Java objects using a SqlResultSetMapping
.
An appropriate result set mapping named incidentCluster
is already defined on the IncidentCluster
entity in the file src/main/java/com/sap/hana/hibernate/sample/entities/IncidentCluster.java
.
@SqlResultSetMappings(@SqlResultSetMapping(name = "incidentCluster", classes = {
@ConstructorResult(targetClass = IncidentCluster.class, columns = {
@ColumnResult(name = "clusterId", type = Long.class),
@ColumnResult(name = "convexHull", type = Geometry.class),
@ColumnResult(name = "numberOfIncidents", type = Long.class),
@ColumnResult(name = "category", type = String.class)
})
}))
This mapping tells Hibernate how to create an IncidentCluster
object from the columns clusterId
, convexHull
, numberOfIncidents
, and category
in each row of the SQL result set.
The first thing the findClusters
method needs to do is to determine for which categories the clusters should be created. If no categories were given by the user, all categories should be used, otherwise only those given by the user.
Next, the clustering query needs to be constructed. It uses the GROUP CLUSTER BY
functionality to do the clustering on the map_location
column. The cluster algorithm used is DBSCAN
with an epsilon value of 200 and a minimum cluster size of 3000 points. These values work well for the data set, but you can play around with them to see how the clusters change for different values.
The conditions in the where clause restrict the incidents to the values specified by the user, i.e. to the date range and the distance around the current location.
It is also specified that the incidentCluster
result set mapping should be used when creating the query. The query select list uses the ST_ClusterID
function to return the cluster ID, the ST_ConvexHullAggr
function to calculate the shape of the cluster, and COUNT(*)
to return the number of incidents in each cluster. The columns of the select list must be aliased to match the column names of the result set mapping.
Finally, the clusters are calculated for each category by executing the query against the database. The clusters for each category are added to the final output list if the cluster ID is greater than 0 (cluster ID 0 means noise) and if the cluster shape is a polygon (some “clusters” consist of a single point or a line if there are very many incidents at one location, which isn’t all that interesting to show on a map).
public List<IncidentCluster> findClusters(Point<G2D> location, Distance distance, Date dateFrom, Date dateTo, List<String> categoryList) {
List<IncidentCluster> resultList = new ArrayList<>();
List<String> clusterCategories;
if ( categoryList == null || categoryList.isEmpty() ) {
clusterCategories = this.findCategories();
}
else {
clusterCategories = categoryList;
}
Query query = this.em.createNativeQuery(
"SELECT "
+ " ST_ClusterID() as clusterId, "
+ " ST_ConvexHullAggr(i.map_location).ST_Transform(4326).ST_AsEWKB() as convexHull, "
+ " COUNT(*) as numberOfIncidents, "
+ " :category as category "
+ "FROM Incident i "
+ "WHERE i.date between :dateFrom and :dateTo "
+ " AND i.category = :category "
+ " AND i.map_location.ST_WithinDistance(ST_GeomFromEWKB(:location).ST_Transform(7131), :distance) = 1 "
+ "GROUP CLUSTER BY i.map_location USING DBSCAN EPS :eps MINPTS :minpts",
"incidentCluster" );
query.setParameter( "dateFrom", dateFrom );
query.setParameter( "dateTo", dateTo );
query.setParameter( "location", location );
query.setParameter( "distance", distance );
query.setParameter( "eps", 200 );
query.setParameter( "minpts", 3000 );
for ( String category : clusterCategories ) {
query.setParameter( "category", category );
List<IncidentCluster> categoryClusters = query.getResultList();
for ( IncidentCluster cluster : categoryClusters ) {
if ( cluster.getClusterId() > 0 && cluster.getConvexHull().getGeometryType() == GeometryType.POLYGON ) {
resultList.add( cluster );
}
}
}
return resultList;
}
Save the IncidentClusterRepository.java
file.