Setup

This topic assumes the product is installed to /precisely/li/software, as described in Installing the SDK. To set up the spatial user-defined functions for Hive, perform the following steps:
  1. Proceed according to your platform.
    Cloudera:
      a. Copy the Hive jar for Location Intelligence to the HiveServer node:
        /precisely/li/software/hive/lib/spectrum-bigdata-li-hive-version.jar
      b. In Cloudera Manager, navigate to the Hive Configuration page and search for the Hive Auxiliary JARs Directory setting. If the value is already set, move the Hive jar into the specified folder. If the value is not set, set it to the parent folder of the Hive jar:
        /precisely/li/software/hive/lib/
    Hortonworks:
      a. On the HiveServer2 node, create the Hive auxlib folder if one does not already exist:
        sudo mkdir /usr/hdp/current/hive-server2/auxlib/
      b. Copy the Hive jar for Location Intelligence to the auxlib folder on the HiveServer2 node:
        sudo cp /precisely/li/software/hive/lib/spectrum-bigdata-li-hive-version.jar /usr/hdp/current/hive-server2/auxlib/
  2. Restart all Hive services.
  3. Launch Beeline (or another Hive client) for the remaining step:
    beeline -u jdbc:hive2://localhost:10000/default -n sdkuser
  4. Register the spatial user-defined functions. The statements below include the temporary keyword after create, so the functions last only for the current Hive session and must be re-registered in every new session; omit temporary to register the functions permanently.
    create temporary function FromWKT as 'com.pb.bigdata.spatial.hive.construct.FromWKT';
    create temporary function FromWKB as 'com.pb.bigdata.spatial.hive.construct.FromWKB';
    create temporary function FromKML as 'com.pb.bigdata.spatial.hive.construct.FromKML';
    create temporary function FromGeoJSON as 'com.pb.bigdata.spatial.hive.construct.FromGeoJSON';
    create temporary function ST_Point as 'com.pb.bigdata.spatial.hive.construct.ST_Point';
    
    create temporary function ToWKT as 'com.pb.bigdata.spatial.hive.persistence.ToWKT';
    create temporary function ToWKB as 'com.pb.bigdata.spatial.hive.persistence.ToWKB';
    create temporary function ToKML as 'com.pb.bigdata.spatial.hive.persistence.ToKML';
    create temporary function ToGeoJSON as 'com.pb.bigdata.spatial.hive.persistence.ToGeoJSON';
    
    create temporary function Disjoint as 'com.pb.bigdata.spatial.hive.predicate.Disjoint';
    create temporary function Overlaps as 'com.pb.bigdata.spatial.hive.predicate.Overlaps';
    create temporary function Within as 'com.pb.bigdata.spatial.hive.predicate.Within';
    create temporary function Intersects as 'com.pb.bigdata.spatial.hive.predicate.Intersects';
    create temporary function IsNullGeometry as 'com.pb.bigdata.spatial.hive.predicate.IsNullGeometry';
    
    create temporary function Area as 'com.pb.bigdata.spatial.hive.measurement.Area';
    create temporary function ClosestPoints as 'com.pb.bigdata.spatial.hive.measurement.ClosestPoints';
    create temporary function Distance as 'com.pb.bigdata.spatial.hive.measurement.Distance';
    create temporary function Length as 'com.pb.bigdata.spatial.hive.measurement.Length';
    create temporary function Perimeter as 'com.pb.bigdata.spatial.hive.measurement.Perimeter';
    
    create temporary function ConvexHull as 'com.pb.bigdata.spatial.hive.processing.ConvexHull';
    create temporary function Intersection as 'com.pb.bigdata.spatial.hive.processing.Intersection';
    create temporary function Buffer as 'com.pb.bigdata.spatial.hive.processing.Buffer';
    create temporary function Union as 'com.pb.bigdata.spatial.hive.processing.Union';
    create temporary function Transform as 'com.pb.bigdata.spatial.hive.processing.Transform';
    
    create temporary function ST_X as 'com.pb.bigdata.spatial.hive.observer.ST_X';
    create temporary function ST_XMax as 'com.pb.bigdata.spatial.hive.observer.ST_XMax';
    create temporary function ST_XMin as 'com.pb.bigdata.spatial.hive.observer.ST_XMin';
    create temporary function ST_Y as 'com.pb.bigdata.spatial.hive.observer.ST_Y';
    create temporary function ST_YMax as 'com.pb.bigdata.spatial.hive.observer.ST_YMax';
    create temporary function ST_YMin as 'com.pb.bigdata.spatial.hive.observer.ST_YMin';
    
    create temporary function GeoHashBoundary as 'com.pb.bigdata.spatial.hive.grid.GeoHashBoundary';
    create temporary function GeoHashID as 'com.pb.bigdata.spatial.hive.grid.GeoHashID';
    create temporary function HexagonBoundary as 'com.pb.bigdata.spatial.hive.grid.HexagonBoundary';
    create temporary function HexagonID as 'com.pb.bigdata.spatial.hive.grid.HexagonID';
    create temporary function SquareHashBoundary as 'com.pb.bigdata.spatial.hive.grid.SquareHashBoundary';
    create temporary function SquareHashID as 'com.pb.bigdata.spatial.hive.grid.SquareHashID';
    
    create temporary function LocalSearchNearest as 'com.pb.bigdata.spatial.hive.search.LocalSearchNearest';
    create temporary function LocalPointInPolygon as 'com.pb.bigdata.spatial.hive.search.LocalPointInPolygon';
    Note: If you want to view the complete stack trace for any encountered error, enable logging in DEBUG mode and then restart the job execution.
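Once registered, the functions can be called from ordinary Hive queries. The statements below are an illustrative sanity check only: the coordinates are arbitrary, and the exact argument lists (for example, any optional coordinate-system parameters) should be confirmed in the function reference.

```sql
-- Construct a point and echo it back as WKT (arguments shown are assumptions):
SELECT ToWKT(ST_Point(-73.98, 40.74));

-- Round-trip a geometry through WKT and apply a predicate to it:
SELECT IsNullGeometry(FromWKT('POINT (-73.98 40.74)'));
```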
  • The first run of a job may take a while if the reference data has to be downloaded from a remote location such as HDFS or S3. A job may also time out when it uses a large number of datasets stored in remote locations such as HDFS or S3. If you are using Hive with the MapReduce engine, you can increase the value of the mapreduce.task.timeout property.
  • Some types of queries will cause Hive to evaluate UDFs in the HiveServer2 process space instead of on a data node. The Routing UDFs in particular use a significant amount of memory and can shut down the Hive server due to memory constraints. To process these queries, we recommend increasing the amount of memory available to the HiveServer2 process (for example, by setting HADOOP_HEAPSIZE in hive-env.sh).
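Both remedies above can be applied directly; the values shown here are illustrative starting points rather than recommendations, so tune them to your cluster. The MapReduce task timeout is specified in milliseconds and can be raised from a Hive session (for example, SET mapreduce.task.timeout=1200000;). The HiveServer2 heap is raised in hive-env.sh:

```shell
# In hive-env.sh on the HiveServer2 node; 4096 MB is an illustrative value,
# not a recommendation.
export HADOOP_HEAPSIZE=4096
```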