I am having a similar problem.
I created a Hive table with a single column, where each row contains one XML record. Here is the script I used to create this first table:
CREATE EXTERNAL TABLE xml_event_table (
xmlevent string)
STORED AS TEXTFILE
LOCATION "/user/cloudera/vector/events";
Here is part of a sample XML event:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"><System><Provider Name="Microsoft-Windows-Security-Auditing" Guid="54849625-5478-4994-a5ba-3e3b0328c30d"></Provider> <EventID Qualifiers="">4672</EventID> <Version>0</Version>…</Event>
I want to create a view that contains the EventID, but the XPath expressions are not working correctly:
CREATE VIEW xpath_xml_event_view01(event_id, computer, user_id)
AS SELECT
xpath_string(xmlevent, 'Event/System/EventID'),
xpath_string(xmlevent, '/Event[1]/System[1]/Computer'),
xpath_string(xmlevent, '/Event[1]/System[1]/EventID')
FROM xml_event_table;
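One possible cause, which is an assumption and not something confirmed here, is the default namespace declared on the Event element (the xmlns attribute). In XPath 1.0 an unprefixed name such as Event only matches elements that are in no namespace, so these paths may match nothing if the parser is namespace-aware. A minimal sketch of a query that sidesteps this by matching on local-name(), using the same table and column as above:

```sql
-- Sketch only: assumes the default xmlns on <Event> is what prevents the match.
-- local-name() compares the element name without its namespace.
SELECT
  xpath_string(
    xmlevent,
    "/*[local-name()='Event']/*[local-name()='System']/*[local-name()='EventID']"
  ) AS event_id
FROM xml_event_table;
```

If this returns the EventID values, the same local-name() pattern could be applied to the other columns in the view.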