I'm using the CSV update handler to index CSV files into Solr. My CSV files have a variable number of fields per line (e.g. 4 fields in line 1, 6 in line 2, and so on):
line1:field1,field2,field3,field4
line2:field1,field2,field3,field4,field5,field6
line3:field1,field2,field3,field4
So is there a way to specify a variable number of field names? What I want is to index 4 columns if a line has four fields and 6 columns if it has six. Any alternative way to achieve this is appreciated too 🙂 Thanks!
UPDATE:
Let me describe the situation.
I have a file with CSV data like the one shown above. I use the fieldnames parameter to specify the field names that Solr should use. Since the lines in my file do not all have the same number of values, I cannot define one standard header for the file without padding some lines with null values. E.g. when I upload the above file with 6 header fields defined, lines 1 and 3 throw an error, and if I use 4 header fields, line 2 throws an error. I want to know if there is a way to specify the header fields so that this case works, or do I have to transform my file so that every line has the same number of fields, padded with dummy values?
Solved this: specify custom fields with default values in schema.xml to account for the two extra fields that only some of the lines have. The provided schema.xml has plenty of examples.
ALTERNATIVE: you can also define a custom UpdateRequestProcessor in Java that adds fields based on conditions, and specify this processor as part of the update processor chain in your request handler.
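
If you would rather go the padding route mentioned in the update, here is a minimal sketch of that preprocessing step in plain Java. It is not part of Solr: the class name CsvPadder and its methods are made up for illustration, and it assumes simple comma-separated lines with no quoted fields that contain commas (as in the sample above). It just appends empty trailing values so that every line has as many columns as the longest one. As far as I know the CSV handler skips zero-length values by default (see the keepEmpty parameter), so the padded columns should simply be left out of those documents, but check that against your Solr version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: pads short CSV lines with empty
// trailing values so that every line has the same number of columns.
final class CsvPadder {

    // Counts the columns in a line by splitting on commas.
    // Assumption: no quoted fields containing commas.
    static int countColumns(String line) {
        // limit -1 keeps trailing empty values, so "a,b," counts as 3 columns
        return line.split(",", -1).length;
    }

    // Appends one comma per missing column until the line has `width` columns.
    // Lines that already have `width` or more columns are returned unchanged.
    static String padLine(String line, int width) {
        StringBuilder sb = new StringBuilder(line);
        for (int i = countColumns(line); i < width; i++) {
            sb.append(',');
        }
        return sb.toString();
    }

    // Pads every line to the column count of the widest line in the input.
    static List<String> padAll(List<String> lines) {
        int width = 0;
        for (String line : lines) {
            width = Math.max(width, countColumns(line));
        }
        List<String> padded = new ArrayList<>();
        for (String line : lines) {
            padded.add(padLine(line, width));
        }
        return padded;
    }
}
```

Calling CsvPadder.padAll on the data lines of the file and writing the result back out gives a file that can be uploaded with a single 6-name fieldnames list.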