[K12OSN] OT -- Not LTSP, but Linux Scripting Question
Kevin Squire
gentgeen at linuxmail.org
Tue Mar 22 19:41:33 UTC 2005
Dear Petre,
Thank you greatly for your time. The script for checking for errors
will be VERY helpful. When I wrote the original post I had some ideas,
but thought they would be a bit clunky. (And they were. I finished
working on it last night -- that itch had to be scratched.)
I did add two lines to your error checker, for the line counter thing.
At the very beginning I just added:
$linecount = 0;
while (<>) {
$linecount++;
It looks like I am going to have to look into perl scripting some more.
I will definitely be using your scripts as a starting point. Thanks
again.
And just as a comparison, here is what I found to work as a bash script.
The perl script is so much nicer. (I removed all the code for the "error
checks".)
#!/bin/bash
# File to split Progress Report into multiple files then combine them
# back together using a provided text file as the guide.
# Variable that I may need to change by hand,
# but don't want to have to do on the command line
outputsuffix="04-05_Mid"
basedir="/home/kevin/Documents/PAVCS/Progress_Reports"
# assign better names to the arguments provided on the command line
teachername=$1
outputdirname=$1
inputfile=$2
parsefile=$1.txt
outputdir=$basedir/$outputdirname
tempdir=$outputdir/temp
mkdir -p "$tempdir"
echo " Splitting $inputfile now please wait ...."
echo " "
pdftk "$inputfile" burst output "$tempdir/$teachername"_%03d.pdf
echo " $inputfile split. Now combining files ..."
echo " "
# Find the number of lines in a $parsefile
noline=`wc -l < "$parsefile"`
# Need to repeat the combine step for each row of the $parsefile
x=1 # initialize x for my counter
while [ "$x" -le "$noline" ]; do
    # Pull row $x out of the guide file, then pick out each comma-separated field
    studentname=`sed -n "${x}p" "$parsefile" | cut -d"," -f1`
    stupg1=`sed -n "${x}p" "$parsefile" | cut -d"," -f2`
    stupg2=`sed -n "${x}p" "$parsefile" | cut -d"," -f3`
    stupg3=`sed -n "${x}p" "$parsefile" | cut -d"," -f4`
    stupg4=`sed -n "${x}p" "$parsefile" | cut -d"," -f5`
    echo " Preparing report for $studentname ..."
    # Need to run a different pdftk command based on the number of pages
    if [ "$stupg4" != "" ]; then
        pdftk "$tempdir/$teachername"_"$stupg1".pdf \
            "$tempdir/$teachername"_"$stupg2".pdf \
            "$tempdir/$teachername"_"$stupg3".pdf \
            "$tempdir/$teachername"_"$stupg4".pdf \
            cat output "$outputdir/$studentname"_"$outputsuffix".pdf
    elif [ "$stupg3" != "" ]; then
        pdftk "$tempdir/$teachername"_"$stupg1".pdf \
            "$tempdir/$teachername"_"$stupg2".pdf \
            "$tempdir/$teachername"_"$stupg3".pdf \
            cat output "$outputdir/$studentname"_"$outputsuffix".pdf
    elif [ "$stupg2" != "" ]; then
        pdftk "$tempdir/$teachername"_"$stupg1".pdf \
            "$tempdir/$teachername"_"$stupg2".pdf \
            cat output "$outputdir/$studentname"_"$outputsuffix".pdf
    elif [ "$stupg1" != "" ]; then
        echo " Every student should have at least 2 pages."
        echo " $studentname has only 1 page, I will stop now."
        echo " "
        exit 1 # non-zero status so callers can see the run failed
    else
        echo " $studentname does not have any pages, I will stop now"
        echo " "
        exit 1
    fi
    echo " $studentname completed."
    echo " "
    x=`expr $x + 1` # Increase counter by 1
done
# Need to clean-up the temp directory
rm -rf "$tempdir"
# This is just here for troubleshooting
echo " "
echo "$0 script ran successfully for $teachername class"
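The loop above calls sed and cut five times per row and needs one pdftk branch per page count. A minimal sketch of an alternative, assuming bash 3.1 or later (for arrays and `+=`): let `read` split each row on commas into an array, then build the input list in a loop. The function name `combine_reports` and its argument order are made up for this illustration and are not part of the script above.

```shell
#!/bin/bash
# Sketch only: combine_reports is a hypothetical helper name.
# Usage: combine_reports GUIDEFILE PAGEPREFIX OUTPUTDIR SUFFIX
# GUIDEFILE rows look like: smithJ,001,002   (two or more pages)
combine_reports() {
    local parsefile=$1 pageprefix=$2 outputdir=$3 suffix=$4
    local -a fields inputs
    local studentname pg

    # IFS=, makes read split each row on commas into the array "fields".
    # The guide file is read on descriptor 3 so pdftk keeps its own stdin.
    while IFS=, read -r -a fields <&3; do
        studentname=${fields[0]}
        if [ "${#fields[@]}" -lt 3 ]; then
            echo " $studentname needs at least 2 pages, I will stop now." >&2
            return 1
        fi
        # One input file per page number, however many pages there are
        inputs=()
        for pg in "${fields[@]:1}"; do
            inputs+=("${pageprefix}_${pg}.pdf")
        done
        pdftk "${inputs[@]}" cat output \
            "$outputdir/${studentname}_${suffix}.pdf" || return 1
    done 3< "$parsefile"
}
```

With the variables from the script above, the call would look something like `combine_reports "$parsefile" "$tempdir/$teachername" "$outputdir" "$outputsuffix"`. Rows with five or six pages would work without adding another elif branch.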
On Tue, 22 Mar 2005 08:44:01 -0600
Petre Scheie <petre at maltzen.net> wrote:
> I suggest you create two scripts: one to check the text file for any
> errors--spaces, non-three digit numbers, whatever--and the other to
> actually do the pdftk stuff once you have a clean text file. I do
> a lot of this kind of stuff and I've found that it is much more
> efficient to make sure you've got a clean file to work with, fix any
> problems beforehand, and then do your 'batch' process, than it is to
> try to write one script that will do merging for 50 lines, discover a
> problem, skip that line but be able to tell you it had a problem, or
> bail/die, in which case you have to fix the problem, start all over
> with the merging except you don't want to do the lines that you got
> through successfully on the first run, so you have to figure out where
> the error was... and so on; you get the idea. So, with that in mind,
> I wrote two quick & dirty perl scripts that should do most of what you
> want. It could probably be done in a shell script, but it would be
> harder (which is how perl came about).
>
> Script 1 looks for errors in the text file, as you described:
>
> #!/usr/bin/perl -w
> # script written for Kevin Squire on the K12LTSP list
> # make sure no lines have any spaces
>
> $errors = 0;
> while (<>) {
>     chomp;
>     if ($_ =~ / /) { # check for spaces
>         print "$_ has a space in it\n";
>         $errors++;
>     }
>     (@array) = split(',',$_);
>     for ($i=1; $i <= $#array; $i++) {
>         # check for page numbers that aren't exactly 3 digits
>         if ($array[$i] !~ /^\d{3}$/) {
>             print "Field $i on line $_ is not three digits\n";
>             $errors++;
>         }
>     }
>     # Uncomment the next line if you want to display a running tally of
>     # the errors
>     # print "Errors is $errors\n";
> }
> ($errors > 0) && print "There were $errors errors\n";
>
> ########### end script1 ################
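For comparison, the same kind of check can be sketched in bash with a line counter built in. This is an illustration under assumptions, not a port of script 1: the name `check_guide` is hypothetical, and the pattern is stricter than the Perl version because it also insists on at least two page numbers per row.

```shell
#!/bin/bash
# Sketch only: check_guide is a hypothetical name.
# Reports rows that do not look like name,NNN,NNN[,NNN...]
check_guide() {
    local file=$1 line lineno=0 errors=0
    # name without spaces or commas, then two or more 3-digit page numbers
    local pattern='^[^ ,]+(,[0-9]{3}){2,}$'

    # "|| [ -n "$line" ]" keeps a final row that lacks a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        lineno=$((lineno + 1))
        if [[ ! $line =~ $pattern ]]; then
            echo "line $lineno: bad entry: $line"
            errors=$((errors + 1))
        fi
    done < "$file"

    if [ "$errors" -gt 0 ]; then
        echo "There were $errors errors"
        return 1
    fi
}
```

Because the function returns non-zero when it finds problems, something like `check_guide "$parsefile" || exit 1` at the top of the merge script would keep the batch step from starting on a dirty file.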
>
> Run this against your text file ('script1 textfile.txt') and it will
> tell you of any errors it finds and where. You could put in a line
> counter to make the errors a bit easier to locate. Fix the
> errors, run this script again, and repeat until you get no errors.
> Then run script 2:
>
> #!/usr/bin/perl -w
>
> while (<>) {
>     $sourcefiles = "";
>     chomp;
>     (@array) = split(',',$_);
>     for ($i=1; $i <= $#array; $i++) {
>         # builds e.g. " smithJ_001.pdf smithJ_002.pdf"; swap $array[0] for
>         # the teacher name if the burst pages are named squire_001.pdf etc.
>         $sourcefiles = $sourcefiles." ".$array[0]."_".$array[$i].".pdf";
>     }
>     print "The input string will be $sourcefiles\n";
>     # system("/path/to/pdftk $sourcefiles cat output $array[0].pdf");
> }
>
> For each line in the text file, this will split up the fields, create
> the pdf file names, and put it all into one string for use with the
> pdftk command. I have the last line commented out because you should
> do a dry run with this first to make sure the $sourcefiles string will
> be what you want. I don't have pdftk so I couldn't really test it,
> but the print command on the penultimate line will show what will be
> passed to pdftk. HTH
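The same dry-run idea can be sketched in bash with a small wrapper that prints a command instead of executing it. The names `run` and `DRY_RUN` are assumptions made up for this example; nothing in pdftk itself provides them.

```shell
#!/bin/bash
# Sketch only: run and DRY_RUN are hypothetical names.
# With DRY_RUN=1 (the default here) the command is printed, shell-quoted;
# with DRY_RUN=0 it is executed.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf 'would run:'
        printf ' %q' "$@"
        printf '\n'
    else
        "$@"
    fi
}
```

Prefixing each pdftk call with `run` shows exactly which arguments would be passed, and setting `DRY_RUN=0` once the output looks right does the real merge without editing the script.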
>
> Petre
>
>
> Kevin Squire wrote:
> > First, I apologize for the OT nature of the post, but I am sure many
> > of you will know / have done something like this. Also, I really
> > did not know where else to post the question. If you know somewhere
> > better, feel free to let me know. :-)
> >
> > The Asst. Prin. has asked me to do something very tedious (I did set
> > myself up for it, but I could use the "brownie points"), and I need
> > some help with the script that I am writing to make it less tedious.
> > I have done a fair bit of scripting, but nothing this advanced, so I
> > need some help.
> >
> > Some general info: Each teacher right now has a single MS Word
> > document with every one of his/her students' progress reports (i.e.
> > I have one file called squire_pr.doc that is 116 pgs for 56
> > students). The AP wants them to be a single document (2 or 3 pgs)
> > per student. He does not care if they stay in .doc format or not, as
> > long as they still look the same.
> >
> > So I have taken my squire_pr.doc and printed it to PDF
> > (squire_pr.pdf) so that I could use a program called pdftk (
> > http://freshmeat.net/projects/pdftk/ ) and split it up into 116
> > single page documents (each one called squire_###.pdf). Then I can
> > use the same program to join the appropriate pages back together
> > again (so squire_001.pdf and squire_002.pdf becomes
> > smithJ_04-05_mid.pdf).
> >
> > I want to put together a script that will automate this stuff (to a
> > certain point). The teacher sends me two files, the 1 large pdf
> > file and a text file with student name and the page numbers of the
> > PDF file that make that student's report. Usually it will be 2
> > pages, but sometimes it will be 3 or maybe even 4. The text file
> > would look something like:
> >
> > smithJ,001,002
> > mouseM,003,004
> > gatesW,005,006,007
> >
> > I have already done the basics on the script -- setting up
> > variables, assigning directories, making sure the correct files
> > exist already, etc.
> > But I don't know how to:
> > (1) get the script to read from the text file,
> > (2) verify that the text file has no spaces and all numbers are in
> >     ### format,
> > (3) assign a variable to each field in the text file,
> > (4) repeat for every line in the text file.
> >
> > Some info/example/notes from my script:
> > =============================================
> > $inputfile is the 1 big PDF file
> > $tempdir/${teachername}_%03d.pdf creates a bunch of single-page
> > PDFs with the names squire_001.pdf, squire_002.pdf, etc.
> > $parsefile is the text file with student names and the page
> > numbers from the PDF that make up their report
> > ==========
> > pdftk "$inputfile" burst output "$tempdir/${teachername}_%03d.pdf"
> >
> > # Now the hard part :-)
> > # Need to read the $parsefile and verify that:
> > # there are no spaces and that all numbers are in ### format
> > # if not just give an error of $parsefile has error (adding a line
> > # number would be nice but not necessary)
> > # and then assign the following:
> > # $studentname from field 1
> > # $stupg1 from field 2
> > # $stupg2 from field 3
> > # $stupg3 from field 4 for those that have 4 fields
> > # $stupg4 from field 5 for those that have 5 fields
> >
> > # then run the command 'pdftk INPUTFILES cat output COMBINEDFILE'
> > # for every single line in the text file
> > # where INPUTFILES would be $tempdir/$teachername_$stupgN.pdf where
> > # N could be 1,2,3 or 4 depending on what was found in text file
> >
> > NOTE -- sorry this got so long, I hope it all makes sense. And
> > thank you in advance for your effort.
> >
> >
>