Processing 2D kinetics in nmrPipe with nlinLS

1. Process first spectrum

First, convert the spectrum from the Bruker «ser» file to nmrPipe format. For the pulse sequence «hsqcetfpf3gp» you can use this script:

fid.com:

#!/bin/csh

bruk2pipe -in ./ser \
    -bad 0.0 -aswap -DMX -decim 2856 \
    -dspfvs 20 -grpdly 67.9860382080078 \
    -xN 2048 -yN 512 \
    -xT 1024 -yT 256 \
    -xMODE DQD -yMODE Echo-AntiEcho \
    -xSW 7002.801 -ySW 4055.150 \
    -xOBS 500.032 -yOBS 50.673 \
    -xCAR 4.773 -yCAR 110.085 \
    -xLAB HN -yLAB 15N \
    -ndim 2 -aq2D TPPI \
    -out ./test.fid -verb -ov

You just need to adjust all the parameters according to your data.
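
Most of these values can be looked up in the Bruker acquisition parameter files. The sketch below is one possible way to list them. It assumes the usual layout of the «acqus» file (one "##$NAME= value" entry per line), and it assumes that TD, SW_h, DECIM, DSPFVS and GRPDLY are the entries behind -xN, -xSW, -decim, -dspfvs and -grpdly. The function names acqus_param and show_conversion_params are made up for this example. For the indirect dimension the same lookup would be run on «acqu2s».

```bash
#!/bin/bash
# Sketch: print selected acquisition parameters from a Bruker "acqus" file.
# Assumes one "##$NAME= value" entry per line.

# acqus_param FILE NAME -> prints the value stored for NAME (empty if absent)
acqus_param() {
    awk -v key="##\$$2=" '$1 == key { print $2; exit }' "$1"
}

# show_conversion_params FILE -> lists values to compare against fid.com
show_conversion_params() {
    for name in TD SW_h DECIM DSPFVS GRPDLY; do
        printf '%-7s %s\n' "$name" "$(acqus_param "$1" "$name")"
    done
}

# Example: show_conversion_params ./acqus
```

Compare the printed values with the ones in fid.com before converting.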

For the pulse sequence «b_hsqcetf3gpsi» you can use this script:

fid.com:

#!/bin/csh

bruk2pipe -in ./ser \
  -bad 0.0 -aswap -DMX -decim 2496 -dspfvs 20   \
  -grpdly 67.9842376708984                      \
  -xN              1024  -yN               256  \
  -xT               400  -yT               128  \
  -xMODE            DQD  -yMODE  Echo-AntiEcho  \
  -xSW         8012.821  -ySW         2534.212  \
  -xOBS         500.032  -yOBS          50.674  \
  -xCAR           4.772  -yCAR         117.073  \
  -xLAB              HN  -yLAB             15N  \
  -ndim               2  -aq2D            TPPI  \
  -out ./test.fid -verb -ov

After converting the spectrum, process the time-domain data to obtain frequency-domain data. This requires a Fourier transform, optionally preceded by a window function applied to the FIDs and followed by phase correction. All of this can be done with the following scripts. For the pulse sequence «hsqcetfpf3gp» you do not need linear prediction in the ¹H dimension, so the script looks like this:

nmrproc.com:

#!/bin/csh

nmrPipe -in ./test.fid      \
| nmrPipe -fn  POLY -time \
| nmrPipe -fn  SINE -c 0.5 -off 0.45 -end 0.98 -pow 2 \
| nmrPipe -fn  ZF   -zf 2 -auto \
| nmrPipe -fn  FT -auto \
| nmrPipe -fn  PS   -p0 142.1 -p1 0.0 -di \
| nmrPipe -fn  EXT -x1 5.9ppm -xn 9.8ppm -sw -verb      \
| nmrPipe -fn  TP \
| nmrPipe -fn  SINE -c 0.5 -off 0.45 -end 0.98 -pow 2 \
| nmrPipe -fn  LP -x1 1 -xn 512 -ord 32 -f -pred 512 -after \
| nmrPipe -fn  ZF -zf 2 -auto \
| nmrPipe -fn  FT -auto \
| nmrPipe -fn  PS   -p0 -90.0 -p1 0.0 -di \
| nmrPipe -fn  EXT -x1 100.0ppm -xn 132.0ppm -sw -verb      \
| nmrPipe -fn  TP \
| nmrPipe -out ./test.ft2 -ov

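For the BEST-HSQC pulse sequence «b_hsqcetf3gpsi» it is better to apply linear prediction in the ¹H dimension, which means adding an LP line before the FT in the ¹H dimension. Note that the script below, as listed, contains LP only after the transpose (TP), that is, in the ¹⁵N dimension: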

nmrproc.com:

#!/bin/csh

nmrPipe -in ./test.fid      \
| nmrPipe -fn  POLY -time \
| nmrPipe -fn  SP -off 0.5 -end 0.98 -pow 2 -c 0.5 \
| nmrPipe -fn  ZF -size 2048 \
| nmrPipe -fn  FT -auto \
| nmrPipe -fn  PS   -p0 -154.2 -p1 0.0 -di \
| nmrPipe -fn  EXT -x1 5.9ppm -xn 9.8ppm -sw -verb      \
| nmrPipe -fn  TP \
| nmrPipe -fn  LP -x1 1 -xn 128 -ord 32 -f -pred 256 -after \
| nmrPipe -fn  SP -off 0.5 -end 0.98 -pow 2 -c 0.5 \
| nmrPipe -fn  ZF -size 2048 \
| nmrPipe -fn  FT -auto \
| nmrPipe -fn  PS   -p0 90.0 -p1 0.0 -di \
| nmrPipe -fn  EXT -x1 100.0ppm -xn 132.0ppm -sw -verb      \
| nmrPipe -fn  TP \
| nmrPipe -out ./test.ft2 -ov

By adjusting fid.com and nmrproc.com you should obtain a 2D spectrum that is correctly phased and has sufficient digital resolution (apply zero filling if needed).

You can view the spectra with nmrDraw:

nmrDraw

2. Process all other spectra

Once you have conversion and processing scripts for the first spectrum, you can apply them to all other spectra, provided they were recorded with the same acquisition parameters.

You can use this bash script to process all your data at once:

superproc.sh:

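#!/bin/bash

# set the path to fid.com and nmrproc.com scripts, as seen from inside
# an experiment directory (quote the path instead of escaping the spaces)
SCRIPT_PATH="../scripts and tables"

# loop over every experiment directory that contains a ser file
for i in $(ls */ser | sort -n)
do
  dn=$(dirname "$i")
  echo "$dn"
  cd "$dn" || continue
  cp "$SCRIPT_PATH/fid.com" "$SCRIPT_PATH/nmrproc.com" .
  ./fid.com
  ./nmrproc.com
  rm fid.com nmrproc.com
  cd ..
done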

3. Pick peaks

While the remaining spectra are being processed, you can open the first spectrum in nmrDraw and pick peaks. Start nmrDraw and load your spectrum. Press Shift + K to open the peak detection window. Type a name for the table in which the peaks will be stored and press the Detect button. This creates the script «pk.tcl» in the current folder and runs it, producing the peak table. You can tune the peak picking parameters to improve the quality of the table. At this step you can also note the noise level: it can be calculated in nmrDraw or read from the end of the «pk.tcl» script.

4. Transfer assignment

If you have a table with the assignment for your protein, you can transfer it to the peak table picked in the previous step. nmrPipe uses the values of the X_PPM and Y_PPM variables to transfer assignments, but all other procedures require the parameters in spectral points. The assignment can be easily transferred with the routine «ipap.tcl». You have to provide the peak table (from step 3), the table with assignments, and the spectrum. The routine sets the ASS variable of the table from step 3 and saves it under a new name. You can run it with a command like this:

ipap.tcl -specName1 test.ft2 -inName1 test.tab -outName1 out.tab -assName assignment.tab -ndim 2 -single

5. Remove degenerate peaks

Usually you do not have an assignment for every picked peak, and not all peaks are of interest, so removing the unneeded ones saves time during fitting. Peaks that are well isolated from the others and not interesting for the analysis can be removed from the peak table. To do this, run nmrDraw and enter the peak table editing mode by pressing Shift + 5. Clicking the left mouse button picks a peak. Clicking the right button on an existing peak removes it. Clicking the middle button on an existing peak changes the value of the variable drawn on the screen (usually INDEX, but you can switch it to ASS in the peak picking window, Shift + K).

6. Sort table, renumber INDEX variable, remove CLUSTID and MEMCNT variables

nmrPipe provides many scripts for manipulating tables. You can find their descriptions here:
http://spin.niddk.nih.gov/NMRPipe/ref/scripts/
Before running the fitting procedure it is worth sorting and renumbering the peak table. D. Kohda wrote a set of scripts called «P-ROI». One of them is of interest here because it sorts an nmrPipe-formatted table by the ASS variable in order of residue number rather than alphabetically. The script is called proiSort. You can download it here:

proiSort

Running it is simple:

proiSort inTable outTable

Be careful when naming peaks in the ASS variable, since all text except digits in ASS is ignored during sorting.
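
The caution above can be illustrated with a small sketch. This is not the code of proiSort. It only assumes that the sorting key is what remains of the ASS label after every non-digit character is deleted. The function names ass_key and sort_labels are made up for this example.

```bash
#!/bin/bash
# Sketch of digit-only sorting keys (an assumption about how labels are read).

# ass_key LABEL -> digits left after deleting every non-digit character
ass_key() {
    printf '%s\n' "$1" | tr -cd '0-9\n'
}

# sort_labels -> reads labels on stdin, prints them ordered by that key
sort_labels() {
    while IFS= read -r label; do
        printf '%s\t%s\n' "$(ass_key "$label")" "$label"
    done | sort -n -k1,1 | cut -f 2
}

# Example: "K7HE1" yields the key 71, so it lands after "G12", not before it.
# printf 'G12\nK7HE1\nA3\n' | sort_labels
```

Under this assumption a label such as K7HE1 is treated as residue 71, so it is safer to keep only the residue number as digits in ASS.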

The INDEX variable can be renumbered with this command:

adjTab.tcl -in out.tab -out out.tab -set INDEX -expr '$loc + 1'

You can sort the table by any variable, for example by CLUSTID:

adjTab.tcl -in out.tab -out out.tab -sort CLUSTID

The next step clusters the peaks. Before running it, remove the variables CLUSTID and MEMCNT from the table. This script removes them:

remClustidMemcnt.sh:

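#!/bin/bash

IN_TABLE="out.tab"
OUT_TABLE="out_1.tab"

# start from an empty output table (-f: no error if it does not exist yet)
rm -f "$OUT_TABLE"

# copy the VARS and FORMAT header lines without their last two entries
# (assumes CLUSTID and MEMCNT are the 24th and 25th variables)
grep VARS "$IN_TABLE" | sed -e 's/ *  */ /g' | cut -d " " -f 1-24 >> "$OUT_TABLE"
grep FORMAT "$IN_TABLE" | sed -e 's/ *  */ /g' | cut -d " " -f 1-24 >> "$OUT_TABLE"
echo "" >> "$OUT_TABLE"

# names and formats of the 23 variables that are kept
VARS=$(grep VARS "$IN_TABLE" | sed -e 's/ *  */ /g' | cut -d " " -f 2-24)
FORMAT=$(grep FORMAT "$IN_TABLE" | sed -e 's/ *  */ /g' | cut -d " " -f 2-24)

# write the data rows for the kept variables
getTabCol.tcl -in "$IN_TABLE" -var $VARS -fmt $FORMAT >> "$OUT_TABLE"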
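
The script above relies on CLUSTID and MEMCNT being the last two variables. If that is not certain, the columns can be dropped by name instead. The following is only a sketch in plain awk, not an nmrPipe tool. It assumes a whitespace-separated table with one VARS line, one FORMAT line with the same number of entries, optional REMARK and DATA lines, and no spaces inside any value. The function name drop_table_vars is made up for this example, and the original column widths are not preserved.

```bash
#!/bin/bash
# Sketch: remove variables from an nmrPipe-style table by name.

# drop_table_vars TABLE NAME... -> prints TABLE without the named variables
drop_table_vars() {
    table=$1
    shift
    awk -v drop="$*" '
        # print the current line without the dropped columns;
        # offset is 1 for header lines (their first field is the tag)
        function emit(offset, tag,    i, out, sep) {
            out = tag
            sep = (tag == "" ? "" : " ")
            for (i = 1 + offset; i <= NF; i++)
                if (!((i - offset) in gone)) { out = out sep $i; sep = " " }
            print out
        }
        BEGIN { n = split(drop, d, " "); for (i = 1; i <= n; i++) skip[d[i]] = 1 }
        $1 == "VARS" {
            # remember which data columns hold the variables to drop
            for (i = 2; i <= NF; i++) if ($i in skip) gone[i - 1] = 1
            emit(1, "VARS"); next
        }
        $1 == "FORMAT" { emit(1, "FORMAT"); next }
        NF == 0 || $1 == "REMARK" || $1 == "DATA" { print; next }
        { emit(0, "") }
    ' "$table"
}

# Example: drop_table_vars out.tab CLUSTID MEMCNT > out_1.tab
```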

7. Detect clusters

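Clustering the peaks greatly improves fitting accuracy. To join overlapping peaks into clusters, run this command on the table from which CLUSTID and MEMCNT have been removed (out_1.tab if you used the script from step 6 unchanged):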

clustTab -in out.tab -out out_clust.tab -x 1 2 -dist 30 20

Do not forget to adjust the region size with the -dist option.

8. Get spectrum list

A complete list of spectra can be obtained with this command:

ls */test.ft2 | sort -n > spec.list

To keep only part of the list, for example the first ten spectra, use this command:

ls */test.ft2 | sort -n | sed -n 1,10p > spec1_10.list

9. Get timestamps

Unix timestamps of the ser files are very useful in kinetics measurements, since they provide precise timing. Use one of these commands to collect the modification times (seconds since 00:00 UTC, January 1, 1970):

Mac OS X

stat -f %m */ser > timestamps.txt

Linux

stat -c %Y */ser > timestamps.txt
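
For kinetics the absolute timestamps are usually converted to the time elapsed since the first spectrum. The sketch below is one way to do that. It assumes timestamps.txt holds one integer per line. Note that */ser is expanded in alphabetical order (1, 10, 11, 2, ...), while the spectrum list in step 8 is sorted numerically, so the sketch sorts the timestamps first. That is only valid if the spectra were recorded in the order of their directory numbers. The function name elapsed_minutes is made up for this example.

```bash
#!/bin/bash
# Sketch: convert Unix timestamps to minutes elapsed since the earliest one.

# elapsed_minutes FILE -> one value per line, in chronological order
elapsed_minutes() {
    sort -n "$1" | awk 'NR == 1 { t0 = $1 } { printf "%.2f\n", ($1 - t0) / 60 }'
}

# Example: elapsed_minutes timestamps.txt > minutes.txt
```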

10. Split spectra to clusters and fit with nlinLS

The table has to be split because nlinLS cannot handle a large number of peaks and spectra at once. Use this script to do that:

run_nlinLS_byClust.sh:

#!/bin/bash

IN_TABLE="out_clust.tab"
DX=8
DY=6
NOISE=200000

N=$(getTabCol.tcl -in $IN_TABLE -var CLUSTID | sort -g | uniq | wc -l)
M=$(wc -l < spec.list)

echo "Number of clusters: $N"
echo "Number of spectra: $M"
echo "Running seriesTab"

seriesTab -in $IN_TABLE -list spec.list \
          -dx $DX -dy $DY -max \
          -xzf $(expr $DX \* 2 + 1) -yzf $(expr $DY \* 2 + 1) \
          -out out_clust_series.tab -verb

rm -f nlinLS.tab nlinLS.log

grep VARS out_clust_series.tab >> nlinLS.tab
grep FORMAT out_clust_series.tab >> nlinLS.tab
echo "" >> nlinLS.tab

for (( i=1; $i<=$N; i++ ))
do
    echo "Fitting cluster #$i."

    rm -f out_clust_"$i"_series.tab
    grep VARS out_clust_series.tab >> out_clust_"$i"_series.tab
    grep FORMAT out_clust_series.tab >> out_clust_"$i"_series.tab
    echo "" >> out_clust_"$i"_series.tab
    awk "\$24 ~ /^$i$/ {print}" out_clust_series.tab >> out_clust_"$i"_series.tab


    
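    # NOTE: $increment is not defined anywhere in this script; set it
    # before running, otherwise -w receives only two values.
    nlinLS -in out_clust_"$i"_series.tab \
           -out nlinLS_clust_"$i".tab \
           -list spec.list \
           -mod GAUSS1D GAUSS1D SCALE1D -w $DX $DY $increment \
           -noise $NOISE -ppm -norm -iter 1000 -maxf 1000 \
           >> nlinLS.log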

    sed '1,14d' nlinLS_clust_"$i".tab >> nlinLS.tab
    
done

The script creates one out_clust_$i_series.tab file per cluster, containing the peaks of the i-th cluster. The files nlinLS.log and nlinLS.tab contain all log messages and the resulting table of fitted parameters, respectively.
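
The loop in run_nlinLS_byClust.sh counts i from 1 to N, so it only reaches every cluster if the CLUSTID values are exactly 1, 2, ..., N. The sketch below checks this before fitting. It assumes the IDs arrive one per line on standard input. The function name check_cluster_ids is made up for this example.

```bash
#!/bin/bash
# Sketch: verify that the distinct cluster IDs are exactly 1..N.

# check_cluster_ids -> exit status 0 if the IDs on stdin are 1..N, else 1
check_cluster_ids() {
    awk 'NF { print $1 }' \
        | sort -n -u \
        | awk '$1 != NR { bad = 1 } END { exit (bad || NR == 0) }'
}

# Example:
# getTabCol.tcl -in out_clust.tab -var CLUSTID | check_cluster_ids \
#     || echo "cluster IDs are not contiguous"
```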

11. Extract volumes as columns to TXT file

To extract volumes from nlinLS.tab (or a similarly formatted table), you need to extract all Z_$i variables, multiply them by the VOL variable and transpose the table. The following script does all of that.

getVolumes.sh:

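#!/bin/bash
# usage: getVolumes.sh inTable outFile

TAB=$1

# drop the three header lines, multiply every series column (26 onwards)
# by VOL (column 20) and keep only the scaled series columns
sed '1,3d' "$TAB" \
    | awk '{ for (i = 26; i <= NF; i++) $i = $i*$20; print }' \
    | cut -d " " -f 26- > "$TAB.tmp"

# put the assignment in front of each row
getTabCol.tcl -in "$TAB" -var ASS > 0.tmp
paste -d " " 0.tmp "$TAB.tmp" > nlinLS_VOL.tmp

# transpose: one column per peak, one row per spectrum
awk '
{
    for (i=1; i<=NF; i++)  {
        a[NR,i] = $i
    }
}
NF>p { p = NF }
END {
    for(j=1; j<=p; j++) {
        str=a[1,j]
        for(i=2; i<=NR; i++){
            str=str"\t"a[i,j];
        }
        print str
    }
}' nlinLS_VOL.tmp > "$2"

# remove only the temporary files created above
rm -f "$TAB.tmp" 0.tmp nlinLS_VOL.tmp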
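
The scaling step can be tried on a toy row before running it on a real table. The sketch below repeats the multiplication from the script with the column numbers passed as arguments. The function name scale_series and the toy row are made up for this example; in nlinLS.tab the volume is taken from column 20 and the series starts at column 26, as in the script.

```bash
#!/bin/bash
# Sketch: multiply the series columns of each row by its volume column.

# scale_series VOLCOL FIRSTCOL -> reads rows on stdin; prints the columns
# from FIRSTCOL onwards, each multiplied by the value in column VOLCOL
scale_series() {
    awk -v v="$1" -v f="$2" '{
        out = ""
        for (i = f; i <= NF; i++) out = out (i > f ? " " : "") $i * $v
        print out
    }'
}

# Toy row: VOL in column 2, relative intensities from column 3 onwards.
# printf 'A3 200 1 0.5 0.25\n' | scale_series 2 3    # prints: 200 100 50
```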