
Thursday, October 24, 2013

IT Starz result

Not great, but I still think it has to be saved here for my records :)

Thursday, August 15, 2013

Correcting NetCDF file data and metadata using NCO

Sometimes we get data in NetCDF files that do not conform to the conventions expected by the tools used for computing standard properties of the data. Here is an example of how such files can be easily corrected using NCO and the shell:

#!/bin/bash
folder=.
prefix=ANUSPLIN_latlon_stmn_
var_name=daily_minimum_temperature
for x in "${folder}/${prefix}"*.nc
do
    x_name=$(basename "$x")
    part=$(echo "$x_name" | cut -d"." -f 1)
    #get the year and month from the file name
    y=$(echo "$part" | cut -d"_" -f 4)
    m=$(echo "$part" | cut -d"_" -f 5)
    #change time units
    ncatted -O -a units,time,m,c,"days since ${y}-${m}-01" "$x"
    # decrement the time values so they start from 0
    ncap2 -O -s "time=time-1;" "$x" -o "$x"
    #replace NaNs (the only values not equal to themselves) with the value of the missing_value attribute
    ncap2 -O -s "where(${var_name} != ${var_name}){${var_name}=${var_name}@missing_value;}" "$x" -o "$x"
done
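
The file-name parsing step can be tried on its own, without NCO installed. The following is a minimal sketch that only builds the units string passed to ncatted; the file name is a hypothetical one following the pattern used in the script above.

```shell
#!/bin/bash
# Hypothetical file name following the ANUSPLIN_latlon_stmn_YYYY_MM.nc pattern.
x=./ANUSPLIN_latlon_stmn_2005_11.nc

# Strip the directory and the extension.
x_name=$(basename "$x")
part=$(echo "$x_name" | cut -d"." -f 1)

# Fields separated by "_": the 4th is the year, the 5th is the month.
y=$(echo "$part" | cut -d"_" -f 4)
m=$(echo "$part" | cut -d"_" -f 5)

# This is the string the script writes into the units attribute of time.
units="days since ${y}-${m}-01"
echo "$units"
```

For the file name above this prints "days since 2005-11-01".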


After running the script, the command `cdo infov fname` should report correct dates and correct minimum, mean and maximum values (which was not the case before):
cdo infov ANUSPLIN_latlon_stmn_2005_11.nc
-1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter name
1 : 2005-11-01 00:00:00 0 544680 287417 : -33.800 -10.348 10.620 : daily_minimum_temperature
2 : 2005-11-02 00:00:00 0 544680 287417 : -38.890 -11.761 11.000 : daily_minimum_temperature
3 : 2005-11-03 00:00:00 0 544680 287417 : -42.150 -12.516 11.690 : daily_minimum_temperature
4 : 2005-11-04 00:00:00 0 544680 287417 : -46.670 -12.967 13.740 : daily_minimum_temperature
5 : 2005-11-05 00:00:00 0 544680 287417 : -45.040 -13.658 13.910 : daily_minimum_temperature
6 : 2005-11-06 00:00:00 0 544680 287417 : -45.100 -14.014 11.840 : daily_minimum_temperature
7 : 2005-11-07 00:00:00 0 544680 287417 : -43.120 -13.841 10.800 : daily_minimum_temperature
8 : 2005-11-08 00:00:00 0 544680 287417 : -41.860 -13.430 10.730 : daily_minimum_temperature
9 : 2005-11-09 00:00:00 0 544680 287417 : -42.100 -13.669 11.880 : daily_minimum_temperature
10 : 2005-11-10 00:00:00 0 544680 287417 : -41.200 -12.747 9.1600 : daily_minimum_temperature
11 : 2005-11-11 00:00:00 0 544680 287417 : -44.540 -13.257 7.7200 : daily_minimum_temperature
12 : 2005-11-12 00:00:00 0 544680 287417 : -43.940 -14.471 9.4400 : daily_minimum_temperature
13 : 2005-11-13 00:00:00 0 544680 287417 : -43.750 -15.916 12.520 : daily_minimum_temperature
14 : 2005-11-14 00:00:00 0 544680 287417 : -46.490 -17.539 8.9400 : daily_minimum_temperature
15 : 2005-11-15 00:00:00 0 544680 287417 : -40.260 -18.271 8.3900 : daily_minimum_temperature
16 : 2005-11-16 00:00:00 0 544680 287417 : -38.060 -18.373 8.1000 : daily_minimum_temperature
For comparison, here is the output of the same command on the original version of the file:
cdo infov ANUSPLIN_latlon_stmn_2005_11.nc
-1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter name
1 : 0000-00-01 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
2 : 0000-00-02 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
3 : 0000-00-03 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
4 : 0000-00-04 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
5 : 0000-00-05 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
6 : 0000-00-06 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
7 : 0000-00-07 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
8 : 0000-00-08 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
9 : 0000-00-09 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
10 : 0000-00-10 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
11 : 0000-00-11 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
12 : 0000-00-12 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
13 : 0000-00-13 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
14 : 0000-00-14 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
15 : 0000-00-15 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
16 : 0000-00-16 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature
17 : 0000-00-17 00:00:00 0 544680 0 : nan nan nan : daily_minimum_temperature

Wednesday, July 24, 2013

Parsing model log file using bash, grep, head, tail and cat ...

This script shows examples of using grep with Perl regular expressions, arithmetic expressions in bash, arrays, head and tail, and cut (a simple text-splitting utility for the shell).

#!/bin/bash
#set -x
echo "$1"
found=$(grep -F --text "At" "$1" | grep -F "User" | grep -F "Time" | grep -F "(sec)")
#echo "$found"
echo "-----------------------------------------------------------------------"
echo "start: " $(echo "$found" | head -1 | cut -f 1 -d"," )
echo "currently (or finished): " $(echo "$found" | tail -1 | cut -f 1 -d"," )
echo "-----------------------------------------------------------------------"
#echo "Current time step is: "
last_progress_line=$(tail -1000 "$1" | grep -F "TIMESTEP" | grep -F "OUT OF" | tail -1)
#select numbers from a phrase
groups=$(echo "$last_progress_line" | grep -Po '\d+')
#store the numbers in an array
i=0
for y in $groups
do
    counts[${i}]=$y
    i=$(( i + 1 ))
done
total_steps=${counts[1]}
last_finished_step=${counts[0]}
nsteps=$(grep -F --text "TIMESTEP" "$1" | grep -F "OUT OF" | wc -l)
echo "Performed ${nsteps} steps"
steps_todo=$(( ${total_steps} - ${last_finished_step} + 1 ))
echo "${steps_todo} steps left ..."
#egrep "At" $1 | egrep "User" | egrep "Time" | egrep "(sec)" | tail -1
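
The number-extraction step from the script above can be tried in isolation. This is only a sketch: the progress line is made up (the real log format may differ), and it uses grep -oE with [0-9]+ instead of grep -Po with \d+, on the assumption that a more portable pattern is wanted. mapfile needs bash 4 or newer.

```shell
#!/bin/bash
# Hypothetical progress line; the actual model log may look different.
line="TIMESTEP 120 OUT OF 1000"

# grep -o prints every match on its own line;
# mapfile -t reads those lines into the array counts.
mapfile -t counts < <(echo "$line" | grep -oE '[0-9]+')

last_finished_step=${counts[0]}
total_steps=${counts[1]}

# Integer arithmetic in bash is done inside $(( ... )).
difference=$(( total_steps - last_finished_step ))
echo "step ${last_finished_step} of ${total_steps}, difference: ${difference}"
```

For the line above this prints "step 120 of 1000, difference: 880".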

Friday, May 10, 2013

Bash: select a column from a file

Suppose you have a file with columns separated by single spaces and you want to extract one column of data from it, say number 9. This is easy to do in the shell using the cut command (note that fields are indexed starting from 1):

cut -d" " -f9 file_list.txt > file_list1.txt
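
One thing to keep in mind is that cut treats every single space as a separator, so columns padded with several spaces produce empty fields. The sketch below uses made-up input lines to show the difference, with awk as one possible alternative for padded columns.

```shell
#!/bin/bash
# Hypothetical input lines: one with single spaces, one with padded columns.
line_single="a b c"
line_padded="a  b c"

# With single spaces, cut returns the second column as expected.
col_single=$(echo "$line_single" | cut -d" " -f2)

# With two spaces in a row, cut sees an empty field between them.
col_padded=$(echo "$line_padded" | cut -d" " -f2)

# awk splits on runs of whitespace by default, so the padding does not matter.
col_awk=$(echo "$line_padded" | awk '{print $2}')

echo "cut, single spaces: '${col_single}'"
echo "cut, padded:        '${col_padded}'"
echo "awk, padded:        '${col_awk}'"
```

Here cut returns "b" for the first line and an empty string for the padded one, while awk returns "b" for the padded line.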

Sunday, February 10, 2013

make is a very useful tool

I try to use a Makefile in my small C or Fortran projects (for Java projects I like to use Maven with its pom.xml), and the more I get to know make, the more I like it. This post shows a Makefile created for a small project that I am trying to maintain and improve. The example demonstrates how to use conditionals (ifeq), how to compile sources from different directories and gather the resulting object files in a common directory, how to avoid pitfalls by using the wildcard function, and how to get a file name without its directory using the notdir function. Of course, all of this can be found with a Google search, on StackOverflow and in the GNU make manual; those were my actual sources while writing this Makefile. Enjoy, and please comment if you see ways to improve it.
SUITE = gcc
#librmn = $(ARMNLIB)/lib/Linux/librmn.a
#@echo "using suite $(SUITE) for build"
ifeq ($(SUITE), pgi)
boost_root = /rech/huziy/Programs/boost
#@echo "using suite $(SUITE) for build"
NETCDF = $(HOME)/rech_progs_40Gb/Programs/netcdf
CXX_FLAGS = -O0 -g -c -I$(NETCDF)/include -I/software/libraries/boost-1.47/include
endif
ifeq ($(SUITE),gcc)
#module load NETCDF/4.1.3-gcc
NETCDF = /sb/software/CentOS-5/libraries/netcdf/4.1.3-gcc
endif
ifeq ($(SUITE), pgi)
CXX = pgCC
FC = pgf90
FORTRAN_FLAGS = -i8 -r8 -byteswapio -c -Mpreprocess -g -Mbounds
#for boost
boost_include = -I$(boost_root)/include -L$(boost_root)/lib
LIBS = $(CLIMLIB)/lib/losub.a -lm $(NETCDF)/lib/libnetcdf_c++.a $(NETCDF)/lib/libnetcdf.a -lcurl -lhdf5_hl -lhdf5 -pgf90libs -lboost_date_time
endif
ifeq ($(SUITE), gcc)
CXX = /software/CentOS-5/compilers/gcc-4.7.2/bin/g++
FC = /software/CentOS-5/compilers/gcc-4.7.2/bin/gfortran
FORTRAN_FLAGS = -fdefault-integer-8 -c -g -fbounds-check
LIBS = -L/software/tools/hdf5/1.8.7-gcc/lib \
-L/software/libraries/boost-1.47/lib \
-L/software/CentOS-5/compilers/gcc-4.7.2/lib64 \
-L/software/CentOS-5/compilers/gcc-4.7.2/lib \
-L$(NETCDF)/lib -lnetcdf_c++ -lcurl -lhdf5_hl -lhdf5
LINK_FLAGS = -g -v -fsignaling-nans
CXX_FLAGS = -O0 -g -c -I$(NETCDF)/include -I/software/libraries/boost-1.47/include -fsignaling-nans
endif
CPP_FILES := $(wildcard *.cpp) \
$(wildcard boost/date_time/gregorian/*.cpp boost/date_time/posix_time/*.cpp) #boost/filesystem/*/*.cpp
OBJ_DIR = obj
OBJS = $(patsubst %.cpp, $(OBJ_DIR)/%.o, $(notdir $(CPP_FILES)))
FORTRANFILES = $(wildcard *.f90)
TARGET = meshroute.exe
test :
	@echo $(CPP_FILES)
	@echo "=================================================================="
	@echo $(OBJS)
#compile .cpp and .f90 files
#$(OBJS):
#@echo Boost libs $(LIBS_BOOST)
# @echo NETCDF=$(NETCDF)
# $(CXX) $(CXX_FLAGS) $(CPP_FILES)
#$(FC) $(FORTRAN_FLAGS) $(FORTRANFILES)
#resultfile: inputfile
# action
$(TARGET): $(OBJS)
	@echo $(OBJS)
	$(CXX) -o $@ $^ $(LIBS) $(LINK_FLAGS)
obj/%.o: %.cpp
	mkdir -p $(OBJ_DIR)
	$(CXX) $(CXX_FLAGS) $< -o $@
obj/%.o: boost/date_time/gregorian/%.cpp
	mkdir -p $(OBJ_DIR)
	$(CXX) $(CXX_FLAGS) $< -o $@
obj/%.o: boost/date_time/posix_time/%.cpp
	mkdir -p $(OBJ_DIR)
	$(CXX) $(CXX_FLAGS) $< -o $@
all: $(TARGET)
	rm -f *.o
clean:
	rm -f $(OBJS) $(TARGET)
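
To see what the OBJS line computes, the same mapping can be mimicked in plain bash. This is only an illustration of the idea: make does the work itself with notdir and patsubst, and the source file names below are hypothetical.

```shell
#!/bin/bash
# Hypothetical source list, similar to what $(wildcard ...) might return.
cpp_files="main.cpp boost/date_time/gregorian/greg_month.cpp"
obj_dir=obj

objs=""
for f in $cpp_files
do
    # basename drops the directory part, like $(notdir ...) in make.
    name=$(basename "$f")
    # Replace the .cpp suffix with .o and prepend the object directory,
    # like $(patsubst %.cpp, $(OBJ_DIR)/%.o, ...).
    objs+="${obj_dir}/${name%.cpp}.o "
done
# Remove the trailing space.
objs=${objs% }
echo "$objs"
```

For the list above this prints "obj/main.o obj/greg_month.o". Since the directory part is dropped, two source files with the same name in different directories would map to the same object file, so this layout assumes that file names are unique across the source directories.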