A good reference for Shell scripting
https://linuxcommand.org/lc3_writing_shell_scripts.php
SELECT reflect("java.lang.String", "valueOf", 1),
       reflect("java.lang.String", "isEmpty"),
       reflect("java.lang.Math", "max", 2, 3),
       reflect("java.lang.Math", "min", 2, 3),
       reflect("java.lang.Math", "round", 2.5),
       reflect("java.lang.Math", "exp", 1.0),
       reflect("java.lang.Math", "floor", 1.9)
FROM src LIMIT 1;
1 true 3 2 3 2.7182818284590455 1.0
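The reflect calls in the query above map closely onto plain Java reflection. The sketch below is a minimal illustration of that idea, not Hive's actual implementation: the class name `ReflectSketch` and the helpers `paramType` and `call` are made up for this note, and it assumes the method is static and is looked up by the types of the supplied arguments.

```java
import java.lang.reflect.Method;

class ReflectSketch {
    // Map boxed argument classes to the primitive types used in JDK signatures.
    // (Assumption: only int and double literals need this in the sketch.)
    static Class<?> paramType(Object arg) {
        if (arg instanceof Integer) return int.class;
        if (arg instanceof Double) return double.class;
        return arg.getClass();
    }

    // Look up className.methodName by argument types and invoke it as a static method.
    static Object call(String className, String methodName, Object... args) {
        try {
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) {
                types[i] = paramType(args[i]);
            }
            Method m = Class.forName(className).getMethod(methodName, types);
            return m.invoke(null, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        System.out.println(call("java.lang.String", "valueOf", 1)); // 1
        System.out.println(call("java.lang.Math", "max", 2, 3));    // 3
        System.out.println(call("java.lang.Math", "min", 2, 3));    // 2
        System.out.println(call("java.lang.Math", "round", 2.5));   // 3
        System.out.println(call("java.lang.Math", "floor", 1.9));   // 1.0
    }
}
```

The lookup depends on knowing the type of every argument, which matters for the NULL case.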
If we pass an untyped NULL as the argument to a method called through reflect, it throws an exception:
hive> SELECT Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', NULL);
OK
Failed with exception java.io.IOException:
org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect getMethod
20/04/23 11:00:05 ERROR CliDriver: Failed with exception
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: UDFReflect getMethod
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:154)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1693)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Surprisingly, Hive handles it when the NULL is typed with CAST(NULL AS STRING): the exception is only logged and the query returns NULL.
hive> select Reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex',
cast(null as string));
OK
UDFReflect evaluate java.lang.reflect.InvocationTargetException
method = public static java.lang.String org.apache.commons.codec.digest.DigestUtils.sha256Hex(java.lang.String)
args = [null]
NULL
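One plausible reading of the two outcomes above, assuming Hive resolves the target method from the argument types (as the "UDFReflect getMethod" message suggests): an untyped NULL carries no String type, so the lookup fails, while a typed NULL lets the lookup succeed and only the call itself fails. The sketch below reproduces that with plain Java reflection. `NullArgSketch` and `hexLength` are hypothetical stand-ins (not DigestUtils.sha256Hex), and using `Void.class` for the untyped case is an assumption made for illustration.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class NullArgSketch {
    // Hypothetical stand-in for a method like sha256Hex: it dereferences its argument.
    public static String hexLength(String s) {
        return Integer.toHexString(s.length());
    }

    // Untyped NULL (modelled here as Void): no hexLength(Void) exists, so lookup fails.
    static String lookupWithVoidType() {
        try {
            NullArgSketch.class.getMethod("hexLength", Void.class);
            return "found";
        } catch (NoSuchMethodException e) {
            return "getMethod failed";
        }
    }

    // Typed NULL: hexLength(String) is found; the call then throws inside the target.
    static String invokeWithTypedNull() {
        try {
            Method m = NullArgSketch.class.getMethod("hexLength", String.class);
            return (String) m.invoke(null, (Object) null);
        } catch (InvocationTargetException e) {
            // The target threw (a NullPointerException here); report null instead.
            return null;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        System.out.println(lookupWithVoidType());  // getMethod failed
        System.out.println(invokeWithTypedNull()); // null
    }
}
```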
DataFrame (DF) | Dataset (DS)
A distributed collection of objects of type Row | Lets users assign a Java class to the records inside a DF
Not type-safe | Type-safe (errors are caught at compile time)
Scala, Java, Python and R | Scala and Java

Leverages Tungsten's fast in-memory encoding.
Encoders are highly optimized and use runtime code generation to build a custom serde. As a result, serialization is faster than Java/Kryo serialization.
The encoded data is comparatively smaller, which improves network transfer speed.
Single interface usable in Java and Scala.
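The type-safety row can be illustrated with a plain-Java analogy. This is not Spark API: the `Map` only stands in for an untyped Row, the hypothetical `Person` class stands in for the Java class bound to a Dataset's records, and all names are made up for this note.

```java
import java.util.HashMap;
import java.util.Map;

class TypeSafetySketch {
    // Typed record, analogous to the class a Dataset binds to each record.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Row-like access: the field name and its type are only checked at run time.
    static String untypedAccess() {
        Map<String, Object> row = new HashMap<>();
        row.put("name", "Ann");
        row.put("age", 30);
        try {
            // Compiles fine, but the value is an Integer, so the cast fails at run time.
            return (String) row.get("age");
        } catch (ClassCastException e) {
            return "runtime error";
        }
    }

    // Typed access: a misspelled field or a wrong type would not compile at all.
    static int typedAccess() {
        Person p = new Person("Ann", 30);
        return p.age;
    }

    public static void main(String[] unused) {
        System.out.println(untypedAccess()); // runtime error
        System.out.println(typedAccess());   // 30
    }
}
```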