A good reference for shell scripting:
https://linuxcommand.org/lc3_writing_shell_scripts.php
SELECT reflect("java.lang.String", "valueOf", 1),
       reflect("java.lang.String", "isEmpty"),
       reflect("java.lang.Math", "max", 2, 3),
       reflect("java.lang.Math", "min", 2, 3),
       reflect("java.lang.Math", "round", 2.5),
       reflect("java.lang.Math", "exp", 1.0),
       reflect("java.lang.Math", "floor", 1.9)
FROM src LIMIT 1;
1	true	3	2	3	2.7182818284590455	1.0
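To make the static-method calls in the query more concrete, here is a minimal Java sketch of the same idea: look up a public static method by class name and method name, then invoke it. The class `StaticReflectSketch` and the helper `callStatic` are hypothetical names invented for this illustration; this is not Hive's implementation, and the parameter types are passed explicitly here rather than derived from the SQL arguments.

```java
import java.lang.reflect.Method;

// Hypothetical helper for illustration only; not Hive's UDF code.
final class StaticReflectSketch {

    // Looks up a public static method by class name, method name, and
    // explicit parameter types, then invokes it with the given arguments.
    static Object callStatic(String className, String methodName,
                             Class<?>[] paramTypes, Object... args) {
        try {
            Method m = Class.forName(className).getMethod(methodName, paramTypes);
            return m.invoke(null, args); // null receiver because the method is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] unused) {
        Class<?>[] oneInt = {int.class};
        Class<?>[] twoInts = {int.class, int.class};
        Class<?>[] oneDouble = {double.class};

        System.out.println(callStatic("java.lang.String", "valueOf", oneInt, 1)); // prints 1
        System.out.println(callStatic("java.lang.Math", "max", twoInts, 2, 3));   // prints 3
        System.out.println(callStatic("java.lang.Math", "min", twoInts, 2, 3));   // prints 2
        System.out.println(callStatic("java.lang.Math", "round", oneDouble, 2.5)); // prints 3
        System.out.println(callStatic("java.lang.Math", "floor", oneDouble, 1.9)); // prints 1.0
    }
}
```

The sketch covers only static methods; the `isEmpty` call in the query is an instance method and is left out here.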
If we pass an untyped NULL value to the above method, Hive throws an exception while looking up the method (UDFReflect getMethod):
hive> SELECT Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', NULL);
OK
Failed with exception java.io.IOException:
org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect getMethod
20/04/23 11:00:05 ERROR CliDriver: Failed with exception
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect getMethod
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:154)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1693)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
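Surprisingly, Hive handles it when the NULL is cast explicitly with CAST(NULL AS STRING): the call logs an InvocationTargetException and the query returns NULL.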
hive> select Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex',
cast(null as string));
OK
UDFReflect evaluate java.lang.reflect.InvocationTargetException
method = public static java.lang.String org.apache.commons.codec.digest.DigestUtils.sha256Hex(java.lang.String)
args = [null]
NULL
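One plausible reading of the two logs is that an untyped NULL gives the reflective lookup no parameter type to match, while a NULL typed as a string lets the lookup succeed and only the invoked method itself fails. The Java sketch below illustrates that difference under stated assumptions: `NullArgReflectSketch`, `inferTypes`, and `callTyped` are hypothetical names, `Integer.parseInt` stands in for `sha256Hex` because it ships with the JDK and also rejects null, and none of this is Hive's actual code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical illustration only; not Hive's UDF code.
final class NullArgReflectSketch {

    // Simplified type inference from runtime arguments. An untyped null
    // carries no class, so no method signature can be built for the lookup.
    static Class<?>[] inferTypes(Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            if (args[i] == null) {
                throw new IllegalArgumentException(
                        "cannot infer a parameter type for null argument " + i);
            }
            types[i] = args[i].getClass();
        }
        return types;
    }

    // With an explicit parameter type the lookup succeeds even when the
    // argument is null. If the target method throws, reflection reports an
    // InvocationTargetException, which this sketch maps to a null result.
    static Object callTyped(Class<?> owner, String name, Class<?> paramType, Object arg) {
        try {
            Method m = owner.getMethod(name, paramType);
            return m.invoke(null, new Object[] {arg}); // static call, one argument
        } catch (InvocationTargetException e) {
            return null; // the invoked method itself failed, e.g. on null input
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("method lookup failed", e);
        }
    }

    public static void main(String[] unused) {
        // Typed null: lookup works, the call fails inside parseInt, result is null.
        System.out.println(callTyped(Integer.class, "parseInt", String.class, null)); // prints null
        // Non-null argument: the call goes through normally.
        System.out.println(callTyped(Integer.class, "parseInt", String.class, "42")); // prints 42
    }
}
```

Calling `inferTypes((Object) null)` throws before any method lookup happens, which is the analogue of the untyped NULL case in this sketch.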
| DataFrame(DF) | DataSet(DS) |
|---|---|
| A distributed collection of objects of type Row | Allows users to assign a Java class to the records inside a DF |
| Not type-safe | Type-safe (compile-time error checking) |
| Scala, Java, Python and R | Scala and Java |
| | Leverages Tungsten’s fast in-memory encoding |
| | Encoders are highly optimized and use runtime code generation to build custom serde. As a result, they are faster than Java/Kryo serialization. |
| | The encoded data is comparatively smaller, which improves network transfer speeds. |
| | Single interface usable in Java and Scala |