Sum up all values of a column in a text file

Recently, I had to sum up all integers of a column in a text file (similar to how you do in excel). After some digging up, I came up with a awk one liner to do it.

Following my tradition of documenting one liners, I am going to document this one as well 🙂

Input and output data

Here is the super simplified version of input data that I was using.

I wanted to find the sum of all values present in the 3rd column. So in the case, the output that I was expecting was 145

Command

Here is the awk one liner, which does this.

Explanation

awk script is executed for each line and the first part of the command creates a variable s that stores the sum of all values in the 3rd column.

When the end of file is reached, the second part of the command is executed, which just prints the value of the variable.

Field separator

If the columns are separated by a comma or by any other non whitespace character, then you have to just specify it by adding FS=',' to the above command.

The more I dig deeper into awk, the more I like it and it is really fascinating to see how much you can do with this tool.

I learned a lot about awk and hopefully this teaches something to you as well 😉

Related posts

Tags: , ,

7 Comments so far

Follow up comments through RSS Feed | Post a comment

  • bruno says:

    Here is a perl oneliner to keep you amused, here you go dude:

    perl -lane ‘$sum += $F[2]; END { print $sum }’ input.txt

    Fun trivia: there exist a bash one-liners (sort of ) way of acheieving this too……

    • Sudar says:

      perl !!?? 😉

      Anyways, I am interested in the bash one-liner though 🙂

      • bruno says:

        there you go mate, I feel bad for forgetting to post this.

        for i in $(cat remarks.txt | tr -s ‘ ‘ | cut -d’ ‘ -f3 ) ; do sum=$(echo ${sum:=0} + $i | bc) ; done ; echo $sum

        I know, bash can be irritating when it comes to floating points and yes, you might object to it not strictly being a one-liner because of ‘;’ so here comes the answer to that.

        FunTrivia again: (yeah because the riddler is my fave char.)

        You can do away with the for loops by using xargs, pipe and subshell.

        • Sudar says:

          Yup, sounds like a good alternative. But I still like the much shorter awk 😉

          • bruno says:

            before i forget it for another decade.

            cat input.txt | tr -s ‘ ‘ | cut -d’ ‘ -f3 | xargs -n 1 bash -c ‘sum=$(echo ${sum:=0} + $1 | bc ) ‘ {} ; echo $sum

            Neither of my one-liners are any good as an alternative to simplicity offered by awk but I usually play around with conversion from for loops to xargs. Nevertheless, It serves good as a party trick at geekspace.

            • Sudar says:

              May be I should checkout xargs more.

              • bruno says:

                If you haven’t , i urge you to spend some time with it. Once i got a basic hang for it, it helped me speedup a lot of slower runing scripts (parallelisation power) + ability for code reusability.

                One pitfall (you probably know it), always remember subshell creation and by extension the scope.

Leave a Reply to bruno Cancel reply

Your email address will not be published. Required fields are marked *