hadoop - How to give equations in Apache Pig
I am trying to compute a value with an equation:
-- counted gives the total row count in the file
samplecount = counted*(10/100);
How can I sample the data according to this value?
-- load data
examples = load '/home/sreeveni/myfiles/pe/uscensus1990new.csv';
-- group data
groupedbyuser = group examples all;
-- count the number of lines in the file
counted = foreach groupedbyuser generate COUNT(examples);
-- sampling
sampled = sample examples counted*(10/100);
store sampled into '/home/sreeveni/myfiles/out/samplesout';
The sample statement above produces this error:
Invalid scalar projection: counted : A column needs to be projected from a relation for it to be used as a scalar
Please advise. What am I doing wrong?
I guess sample works with a number in the range [0,1]. In your case, the value you pass exceeds that range. If you want 10% of the data, pass 0.1 directly; otherwise, in your code, compute the percentage inside the foreach statement only.
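A minimal sketch of both suggestions, reusing the paths from the question. The alias sampled_fixed, the column name total, and the target of 1000 rows are hypothetical, and the second option assumes a Pig release that accepts a scalar expression as the sample size. Note also that 10/100 is integer division in Pig and evaluates to 0, so the fraction should be written as 0.1.

```pig
-- load data
examples = load '/home/sreeveni/myfiles/pe/uscensus1990new.csv';

-- Option 1: pass the fraction directly; keeps roughly 10% of the rows
sampled = sample examples 0.1;
store sampled into '/home/sreeveni/myfiles/out/samplesout';

-- Option 2 (assumption: this Pig version accepts a scalar expression here):
-- name the count column and project it as counted.total, which avoids the
-- "invalid scalar projection" error; 1000 is a hypothetical target row count
groupedbyuser = group examples all;
counted = foreach groupedbyuser generate COUNT(examples) as total;
sampled_fixed = sample examples (double)1000/counted.total;
```

Sampling is probabilistic, so the number of rows returned is only approximately the requested share. With the second option, the expression must still evaluate to a value in [0,1], which holds only when the file has at least as many rows as the target.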