Hi,
I am encountering a unique issue that I am unable to resolve, and I cannot work out where the problem lies. I am trying to import incremental data from an Oracle table to HDFS using Sqoop. For simplicity, here is the sample table structure and data:
message_id (NUMBER)    updated_time (DATE)
1000                   3/14/2008 10:39:24 AM
1001                   3/14/2008 08:40:30 AM
1004                   4/15/2008 12:10:15 PM
When I Sqoop-import the above data in "lastmodified" incremental mode with the check column set to "Updated_Time", the values stored in HDFS look like the sample below (the actual command is sketched after the output):
messageid~updated_time
1000~2008-03-14
1001~2008-03-14
1004~2008-04-15
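For reference, the import is run essentially as follows; connection details are in the options file, and this is the command without the TO_CHAR workaround described further down:

sqoop --options-file /home/options_file.txt --password-file /home/common/password.txt \
  --query 'select MESSAGE_ID, UPDATED_DATETIME FROM schema_name.table_name where $CONDITIONS' \
  --target-dir /user/m_m \
  --incremental "lastmodified" --check-column "UPDATED_DATETIME" \
  --last-value "2001-01-01 00:00:00" \
  --fields-terminated-by '~' -m 1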
ISSUE: For some reason, the time portion of the DATE value is truncated and not stored in HDFS; only the date portion comes through.
However, if I add an extra column to the query that converts the same DATE value with TO_CHAR, as shown below, the full value is stored as desired, but it is now a string rather than a date/timestamp.
sqoop --options-file /home/options_file.txt --password-file /home/common/password.txt \
  --query 'select MESSAGE_ID, UPDATED_DATETIME, to_char(UPDATED_DATETIME, '\''YYYY-MM-DD HH24:MI:SS'\'') as UPDATED_DATETIME_CHAR FROM schema_name.table_name where $CONDITIONS' \
  --target-dir /user/m_m \
  --incremental "lastmodified" --check-column "UPDATED_DATETIME" \
  --last-value "2001-01-01 00:00:00" \
  --fields-terminated-by '~' -m 1
RESULT:
1000~2008-03-14~2008-03-14 10:39:24
1001~2008-03-14~2008-03-14 08:40:30
1004~2008-04-15~2008-04-15 12:10:15
I do not want to add an extra TO_CHAR column just to preserve a DATE column's full value. Any other DATE columns in the same table that I need to store in HDFS would also require a TO_CHAR wrapper, since otherwise only the date portion is kept.
Has anyone run into a similar issue, and can you suggest possible solutions? I have found a few articles online about changing configurations such as oracle.jdbc.mapDateToTimestamp, but they did not help.
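For completeness, this is roughly how I tried applying that suggestion, passing the property through a JDBC connection-parameter file (the properties file path here is just a placeholder):

# contents of /home/oracle_session.properties (placeholder name)
oracle.jdbc.mapDateToTimestamp=true

sqoop --options-file /home/options_file.txt --password-file /home/common/password.txt \
  --connection-param-file /home/oracle_session.properties \
  --query 'select MESSAGE_ID, UPDATED_DATETIME FROM schema_name.table_name where $CONDITIONS' \
  --target-dir /user/m_m \
  --incremental "lastmodified" --check-column "UPDATED_DATETIME" \
  --last-value "2001-01-01 00:00:00" \
  --fields-terminated-by '~' -m 1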
Thanks,
Pavan