Reflect function:
A Java class and method often already exist that do exactly what a user wants to do in Hive. Rather than writing a wrapper UDF to call such a method, most of these methods can be called through the reflect UDF. reflect uses Java reflection to instantiate objects and call their methods; it can also call static methods. The method must return a primitive type or a type that Hive knows how to serialize.
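To make the mechanism concrete, here is a minimal Java sketch of the kind of reflective call described above. It is not Hive's implementation; the class name ReflectSketch and the helpers callStatic and callOnNewInstance are hypothetical names chosen for illustration.

```java
import java.lang.reflect.Method;

// Hypothetical illustration only; this is not Hive's UDF code.
class ReflectSketch {
    // Look up a public static method by name and parameter types, then invoke it.
    public static Object callStatic(String className, String methodName,
                                    Class<?>[] paramTypes, Object... args) throws Exception {
        Class<?> cls = Class.forName(className);
        Method m = cls.getMethod(methodName, paramTypes);
        return m.invoke(null, args); // null receiver because the method is static
    }

    // Instantiate the class with its no-arg constructor, then call a no-arg instance method.
    public static Object callOnNewInstance(String className, String methodName) throws Exception {
        Class<?> cls = Class.forName(className);
        Object target = cls.getDeclaredConstructor().newInstance();
        return cls.getMethod(methodName).invoke(target);
    }

    public static void main(String[] args) throws Exception {
        // Static call, comparable to reflect("java.lang.Math", "max", 2, 3)
        System.out.println(callStatic("java.lang.Math", "max",
                new Class<?>[] {int.class, int.class}, 2, 3));
        // Instance call on a fresh (empty) String, comparable to reflect("java.lang.String", "isEmpty")
        System.out.println(callOnNewInstance("java.lang.String", "isEmpty"));
    }
}
```

Note that the lookup needs the parameter types as well as the method name, which becomes relevant when an argument is NULL.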
SELECT reflect("java.lang.String", "valueOf", 1),
       reflect("java.lang.String", "isEmpty"),
       reflect("java.lang.Math", "max", 2, 3),
       reflect("java.lang.Math", "min", 2, 3),
       reflect("java.lang.Math", "round", 2.5),
       reflect("java.lang.Math", "exp", 1.0),
       reflect("java.lang.Math", "floor", 1.9)
FROM src LIMIT 1;
1   true   3   2   3   2.7182818284590455   1.0
SHA-256 hashing using DigestUtils (from Apache Commons Codec) and the reflect function in Hive:
hive> SELECT Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', 'HANU');
OK
72102548c156fe16ed7ff108def7ddf19332d510b0afc29749a83dfd47787077
If we pass an untyped NULL to the method above, Hive throws an exception:
hive> SELECT Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', NULL);
OK
Failed with exception java.io.IOException:
org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect getMethod
20/04/23 11:00:05 ERROR CliDriver: Failed with exception
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect getMethod
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:154)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1693)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
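The message "UDFReflect getMethod" suggests that the failure happens while the method is being resolved, before anything is invoked. The sketch below shows in plain Java why overload resolution needs a usable parameter type. It assumes, for illustration, that a bare NULL reaches the lookup as a void-typed argument; the class OverloadLookupSketch and the helper findValueOf are hypothetical, and String.valueOf stands in for the overloaded sha256Hex.

```java
import java.lang.reflect.Method;

// Hypothetical illustration only; this is not Hive's UDF code.
class OverloadLookupSketch {
    // Resolve String.valueOf for one parameter type; return null if no overload matches.
    public static Method findValueOf(Class<?> paramType) {
        try {
            return String.class.getMethod("valueOf", paramType);
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(findValueOf(int.class));    // resolves the int overload
        System.out.println(findValueOf(Object.class)); // resolves the Object overload
        System.out.println(findValueOf(void.class));   // no overload takes void (assumed analogue of a bare NULL)
    }
}
```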
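Surprisingly, Hive tolerates the call when the NULL is typed with CAST(NULL AS STRING). The method is resolved, the invocation fails with an InvocationTargetException that is only logged, and the query returns NULL: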
hive> select Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', cast(null as string));
OK
UDFReflect evaluate java.lang.reflect.InvocationTargetException
method = public static java.lang.String org.apache.commons.codec.digest.DigestUtils.sha256Hex(java.lang.String)
args = [null]
NULL
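The output above is consistent with a reflective call whose target method throws on a null argument: Java wraps the original exception in an InvocationTargetException, and a caller can log it and return null. The following sketch reproduces that pattern with java.util.Objects.requireNonNull as a stand-in for sha256Hex. The class NullInvokeSketch and the helper invokeOrNull are hypothetical, and the catch-and-return-null behaviour is inferred from the output, not taken from Hive's source.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Objects;

// Hypothetical illustration only; this is not Hive's UDF code.
class NullInvokeSketch {
    // Reflectively call Objects.requireNonNull(Object); return null if the target throws.
    public static Object invokeOrNull(Object arg) throws ReflectiveOperationException {
        Method m = Objects.class.getMethod("requireNonNull", Object.class);
        try {
            return m.invoke(null, new Object[] {arg});
        } catch (InvocationTargetException e) {
            // The NullPointerException thrown by the target is available as the cause.
            System.err.println("evaluate " + e + ", cause: " + e.getCause());
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(invokeOrNull("HANU")); // the argument is returned unchanged
        System.out.println(invokeOrNull(null));   // the target throws, so null is returned
    }
}
```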
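Spark SQL does not handle the exception in the same way; the same query loses its task with an InvocationTargetException:
spark.sql("select Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', cast(null as string))").show()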
[Stage 427:> (0 + 1) / 1]20/04/23 11:25:43 WARN TaskSetManager: Lost task 0.0 in stage 427.0 469): java.lang.reflect.InvocationTargetException
Caused by: java.lang.NullPointerException
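One way to avoid depending on how each engine treats the exception is to guard against NULL before hashing. The sketch below is a null-safe SHA-256 helper of the kind a small wrapper UDF might contain. It uses the JDK's MessageDigest instead of DigestUtils so that it runs without extra dependencies; the class NullSafeSha256 and the method sha256HexOrNull are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the names are not part of Hive, Spark, or Commons Codec.
class NullSafeSha256 {
    // Return the lowercase hex SHA-256 of the UTF-8 bytes of input, or null for null input.
    public static String sha256HexOrNull(String input) {
        if (input == null) {
            return null; // propagate NULL instead of throwing
        }
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256HexOrNull("HANU"));
        System.out.println(sha256HexOrNull(null));
    }
}
```

Where the engine version provides it, the built-in sha2(col, 256) function may be a simpler alternative to reflect for this particular hash.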