Handling of arrays

Clement Ow

A snippet of my code is as follows:
$selections = ["*","*"]
$file_exception = ["RiskViewer*","*.xls"]

$source = ["C:/Test", "C:/Test"]

$dest = ["U:/Test","U:/Test"]


sd_a = $source.zip($dest, $selections, $file_exception)

sd_a.each do |sd|
  $source, $destination, $selections, $file_exception = sd
  src = File.join $source, $selections
  puts src
  d = $d1
  dst = File.join $destination, d
  test = File.join $source, $file_exception
  src1 = Dir.glob(src) - Dir.glob(test)

  Dir.glob(src1) do |file|
    FileUtils.mv file, dst
  end
end

I'm developing a script to move files to the destination paths. At the
moment I can only give one exception per file path, but in some
scenarios I need two exceptions for the same path, hence the code
above, which lists the same source path twice.
The problem is that the loop then runs twice for that path: the first
pass moves everything except the RiskViewer files (including the .xls
files), so by the time the second pass runs, all the other files have
already been moved.

So is there any way for the script to check two or more exceptions
before executing the move command, assuming we still use arrays?
Thanks in advance for any help!
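To see why the two passes interfere, here is a minimal sketch that simulates the loop on plain file names instead of moving real files. The file names and the helper `simulate_passes` are made up for illustration, and `File.fnmatch` stands in for the matching that `Dir.glob` would do on disk.

```ruby
# Simulates the zip-based loop on an in-memory list of file names.
# Each pass "moves" every remaining file that its single exception does not match.
def simulate_passes(files, exceptions)
  remaining = files.dup
  moved = []
  exceptions.each do |pattern|
    batch = remaining.reject { |name| File.fnmatch(pattern, name) }
    moved.concat(batch)
    remaining -= batch
  end
  [moved, remaining]
end

FILES = ["RiskViewer1.txt", "report.xls", "notes.txt"] # hypothetical names
MOVED, LEFT = simulate_passes(FILES, ["RiskViewer*", "*.xls"])

p MOVED # report.xls goes in pass 1, RiskViewer1.txt goes in pass 2
p LEFT  # nothing is protected by both exceptions at once
```

Each exception only protects files during its own pass, which is why the exceptions for one path need to be applied together before anything is moved.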
 
Jesús Gabriel y Galán

Untested but change this:
src1 = Dir.glob(src) - Dir.glob(test)

to:

src1 = $file_exception.inject(Dir.glob(src)) {|result, ex| result - Dir.glob(File.join($source, ex))}

Also, running a Dir.glob for each exception can be expensive. Why not
use regular expressions to filter the result of the first glob? You
will have to change the exceptions a little bit, but it might be worth
it:

['\.txt', '\.sql'].inject(Dir.glob("/home/jesus/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}

This drops from the listing of my home folder all files whose names
match ".txt" or ".sql".
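As a variation on the same idea, the patterns can be combined into a single regexp so the file list is scanned only once. This is a sketch, not the code from the post: `glob_excluding` is a hypothetical helper, the patterns are anchored with `\z` so they only match file extensions, and a temporary directory stands in for a real source folder.

```ruby
require 'tmpdir'
require 'fileutils'

# Returns the entries of dir whose path matches none of the given regexps.
def glob_excluding(dir, patterns)
  excluded = Regexp.union(patterns) # one regexp that matches any of the patterns
  Dir.glob(File.join(dir, "*")).reject { |path| path =~ excluded }
end

Dir.mktmpdir do |dir|
  %w[a.txt b.sql c.rb].each { |name| FileUtils.touch(File.join(dir, name)) }
  kept = glob_excluding(dir, [/\.txt\z/, /\.sql\z/])
  p kept.map { |path| File.basename(path) } # only c.rb survives the filter
end
```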

Hope this helps,

Jesus.
 
Clement Ow

Hi Jesus,
nice slick code there using regexp ;) Just wondering whether it would
be possible to have a different set of file exceptions for different
file paths. At the moment there can only be one set of file exceptions
shared by every file path, which is a problem when we need to cater to
different source paths. Perhaps a few arrays inside file_exception?
 
Jesús Gabriel y Galán

If what you want is to associate different information with each
source path, I would look into a hash of hashes, a hash of arrays, or a
hash of structs, where you could have a complex object for the info
related to an entity of your program. For example:

A hash of arrays, if you only need exceptions:

exceptions_by_path = {}
exceptions_by_path["/home/jesus"] = ['\.txt', '\.sql']
exceptions_by_path["/home/jesus/applications"] = ['\.sh', '\.bin']
# [...]

and then

paths.each do |path|
  files = exceptions_by_path.inject(Dir.glob("#{path}/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}
end

If for a path you need several different things, I would go with a
Struct or a custom class:

FileInfo = Struct.new :exceptions, :other_value, :yet_another
paths_info = {}
paths_info["/home/jesus"] = FileInfo.new(['\.txt', '\.sql'], "other value", "another")

and then access the exceptions array as
paths_info["/home/jesus"].exceptions and use it as before.

Hope this helps,

Jesus.
 
Clement Ow

Thanks Jesus, that was really helpful! However, I altered the code a
little because Ruby gave me an error saying that it can't convert Array
into String (hmm, not sure if that's normal) for this line (I suppose
it's because of the arrays in the hash, exceptions_by_path):
files = exceptions_by_path.inject(Dir.glob("#{path}/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}

So I decided to name my arrays in the hash file_exception[0],
file_exception[1], ... and used this instead, with an incrementing
index:
i = 0
src1 = $file_exception.inject(Dir.glob(src)) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}
i = 1 + i


But I had the problem of accidentally keying in a wrong path name, and
nothing was shown for the 2nd source path. (I only realised after
troubleshooting for a whole hour.) So is there any way to have a
condition, something like: if result == nil, puts "wrong pathname"?
(Just an idea; I tried it and it doesn't work.)
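One reason the `result == nil` idea cannot work is that `Dir.glob` returns an empty array, not nil, when nothing matches, and that includes the case where the directory does not exist. A possible approach is to test the directory itself before globbing. The sketch below assumes a hypothetical helper `checked_glob` and a directory name that is presumed not to exist.

```ruby
require 'tmpdir'

# Returns the matching entries, or raises if the source directory is missing.
# An empty array only means "nothing matched", so it is not treated as an error.
def checked_glob(source, selection = "*")
  raise ArgumentError, "wrong pathname: #{source}" unless File.directory?(source)
  Dir.glob(File.join(source, selection))
end

missing = File.join(Dir.tmpdir, "no_such_dir_for_this_example") # assumed not to exist
p Dir.glob(File.join(missing, "*")) # an empty array, never nil

begin
  checked_glob(missing)
rescue ArgumentError => e
  puts e.message
end
```

Raising is only one option; printing a warning and skipping to the next source path would fit a batch script just as well.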

And by the way, don't get me wrong, Struct is sweet too, it's just
that I didn't want a fixed number of exceptions for each path. ;)

Thanks again.
 
Clement Ow

Also, I have problems writing some file exceptions. For example, I
have some files whose names start with 2007, such as 20070131, and a
file called "Risk 20070131". When I put 2007 in the file_exception
variable, it deselects every file that has 2007 anywhere in its name,
which is not what I want. I know that the regexp doesn't allow
something like 2007* in file_exception to deselect only the files that
start with 2007. Any ideas at all, anyone?

Regards
 
Jesús Gabriel y Galán

Clement Ow said:
Thanks Jesus, that was really helpful! However, I altered the code a
little because Ruby gave me an error saying that it can't convert Array
into String (hmm, not sure if that's normal) for this line (I suppose
it's because of the arrays in the hash, exceptions_by_path):
files = exceptions_by_path.inject(Dir.glob("#{path}/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}

Sorry, that's what I get for not testing the code. I think (I'm a bit
dense right now, so this might not work) that what I meant was this:

exceptions_by_path = {}
exceptions_by_path["/home/jesus"] = ['\.txt', '\.sql']
exceptions_by_path["/home/jesus/applications"] = ['\.sh', '\.bin']
# [...]

and then

paths.each do |path|
  files = exceptions_by_path[path].inject(Dir.glob("#{path}/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}
end

so you get the exceptions for that path, which is an Array that should
work with inject as I was expecting.

Clement Ow said:
And by the way, don't get me wrong, Struct is sweet too, it's just
that I didn't want a fixed number of exceptions for each path. ;)

FileInfo = Struct.new :exceptions, :other_value, :yet_another
paths_info = {}
paths_info["/home/jesus"] = FileInfo.new(['\.txt', '\.sql'], "other value", "another")

The number of exceptions is not fixed: as you can see above, I am
storing an array in the :exceptions field, so each path can have a
different number of exception patterns.
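Putting the hash and the Struct together, a sketch of per-path rules might look like the following. `PathRule`, `RULES`, `apply_rules` and the second path are hypothetical names chosen for illustration, and the filter runs on plain file names so the example does not touch the disk.

```ruby
# PathRule is a hypothetical record: one destination and any number of exceptions.
PathRule = Struct.new(:dest, :exceptions)

RULES = {
  "C:/Test"  => PathRule.new("U:/Test",  ['\ARiskViewer', '\.xls\z']),
  "C:/Other" => PathRule.new("U:/Other", ['\A2007'])
}

# Filters a list of file names with the exceptions registered for one source path.
# Hash#fetch raises KeyError for a source path that has no rule.
def apply_rules(rules, source, names)
  rules.fetch(source).exceptions.inject(names) do |result, ex|
    result.reject { |name| name =~ Regexp.new(ex) }
  end
end

p apply_rules(RULES, "C:/Test",  ["RiskViewer1.txt", "a.xls", "b.txt"]) # => ["b.txt"]
p apply_rules(RULES, "C:/Other", ["20070131", "Risk 20070131"])         # => ["Risk 20070131"]
```

Because each rule carries its own array, one path can have two exceptions and another only one, without listing any source path twice.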

Jesus.
 
Jesús Gabriel y Galán

Clement Ow said:
Also, I have problems writing some file exceptions. For example, I
have some files whose names start with 2007, such as 20070131, and a
file called "Risk 20070131". When I put 2007 in the file_exception
variable, it deselects every file that has 2007 anywhere in its name,
which is not what I want. I know that the regexp doesn't allow
something like 2007* in file_exception to deselect only the files that
start with 2007. Any ideas at all, anyone?

I'm not sure if I'm understanding you correctly, but you can tweak the
regexps so that they actually match what you want. If you want
to deselect only the files that start with 2007 you can do this:

irb(main):001:0> a = %w{20071212 20073445 risk20072341}
=> ["20071212", "20073445", "risk20072341"]
irb(main):003:0> a.reject {|x| x =~ /\A2007/}
=> ["risk20072341"]

Jesus.
 
Clement Ow

Jesús Gabriel y Galán said:
Sorry, that's what I get for not testing the code. I think (I'm a bit
dense right now, so this might not work) that what I meant was this:

Nah, it's fine, we all know it can get a little tiring sometimes. As
it is with this script I'm writing, lol.

Jesús Gabriel y Galán said:
files = exceptions_by_path[path].inject(Dir.glob("#{path}/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}
so you get the exceptions for that path, which is an Array that should
work with inject as I was expecting.

Anyway, I get what you mean, but because I'm using a config file to
hold all the source paths, destination paths and file exceptions, which
are keyed in by the user, I don't want to make it too complicated to
fill in the paths, so I decided to just use numbers to name the arrays
in the hash.

Jesús Gabriel y Galán said:
I'm not sure if I'm understanding you correctly, but you can tweak the
regexps so that they actually match what you want. If you want
to deselect only the files that start with 2007 you can do this:

irb(main):001:0> a = %w{20071212 20073445 risk20072341}
=> ["20071212", "20073445", "risk20072341"]
irb(main):003:0> a.reject {|x| x =~ /\A2007/}
=> ["risk20072341"]

Oh yeah, I did try this but somehow it doesn't work *scratches head*.
It just shows all the files as to be moved, so it obviously did not
apply the exceptions. But if it's done this way, there would no longer
be any point in keying in the various exceptions, hence I got stuck. :/
 
Clement Ow

Oh, I realise what was wrong with the regexp matching. In:
files = exceptions_by_path[path].inject(Dir.glob("#{path}/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}
after much playing around with this statement, I found that result
here holds whole path names rather than the file names themselves,
which is why /\A2007/ cannot match (I probably need to use
File.basename). But can you do me a favour and explain what this
statement means? I don't understand how the variables get assigned
their values.
 
Clement Ow

Clement said:
Oh, I realise what was wrong with the regexp matching. In:
files = exceptions_by_path[path].inject(Dir.glob("#{path}/*")) {|result, ex| result.reject{|x| x =~ Regexp.new(ex)}}

I found what is actually going on in this statement. Because x is the
whole path name, e.g. //sins1234/home/file_name, the regexp /\A2007/
doesn't match at the beginning of the string. Also, the exceptions
array must contain "\\A2007" (with the backslash doubled) when passing
it to Regexp.new. So my code now looks like this:
src1 = $file_exception.inject(Dir.glob(src)) {|result, ex| result.reject {|x| File.basename(x) =~ Regexp.new(ex, Regexp::IGNORECASE)}}
Dir.glob(src1).each do |file|
  # do sth
end

However, for education's sake, do you mind explaining how the whole
inject statement works? Thanks! ;)
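If the exceptions in the config file should stay in glob style (for instance "2007*" rather than "\\A2007"), one possible alternative is `File.fnmatch` applied to the base name. This is a sketch under that assumption, not the code from the thread: `reject_by_basename` and the example paths are hypothetical, and `File::FNM_CASEFOLD` plays the role of `Regexp::IGNORECASE`.

```ruby
# Keeps the paths whose base name matches none of the glob-style exceptions.
def reject_by_basename(paths, exceptions)
  paths.reject do |path|
    name = File.basename(path) # match the file name, not the whole path
    exceptions.any? { |pat| File.fnmatch(pat, name, File::FNM_CASEFOLD) }
  end
end

paths = [
  "//sins1234/home/20070131",
  "//sins1234/home/Risk 20070131",
  "//sins1234/home/riskviewer.cfg"
]
p reject_by_basename(paths, ["2007*", "RiskViewer*"])
# => ["//sins1234/home/Risk 20070131"]
```

Here "2007*" only excludes names that start with 2007, so "Risk 20070131" is kept.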
 
Jesús Gabriel y Galán

Clement Ow wrote:
However, for education's sake, do you mind explaining how the whole
inject statement works? Thanks! ;)

Enumerable#inject is a very powerful iterator (in my opinion at
least). It iterates over all the elements of an enumerable, yielding
to the block an accumulator and the next element of the enumerable.
The accumulator then gets updated by the result of the block, so the
next iteration will be yielded that value. If you pass an argument to
inject, it is used as the first accumulator. If not, the first element
of the enumerable is used instead and iteration starts from the second
element. Some examples:

irb(main):003:0> [1,2,3].inject(0) {|total,x| p [total, x]; total + x}
[0, 1]
[1, 2]
[3, 3]
=> 6
irb(main):004:0> [1,2,3].inject {|total,x| p [total, x]; total + x}
[1, 2]
[3, 3]
=> 6

Another one (although this is just to show how inject works, cause
the functionality would be better achieved by map):

irb(main):011:0> [1,2,3,4,5].inject([]) {|total,x| p [total,x]; total + [x**2]}
[[], 1]
[[1], 2]
[[1, 4], 3]
[[1, 4, 9], 4]
[[1, 4, 9, 16], 5]
=> [1, 4, 9, 16, 25]

The p [total,x]; helps to show what gets passed to the block each time.
Just remember: the result of the block will be the next "total".

In our case, the result of the block was the array of files minus
those that matched the current exception. So each time that array
(well, a copy) was passed in along with the next exception, and the
result of the block was another array with fewer elements, and so on.
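The mechanics described above can be spelled out with a simplified re-implementation. This is only an illustration: `my_inject` and `NO_INITIAL` are made-up names, and the real Enumerable#inject has extra forms (such as taking a symbol like :+) that are left out here.

```ruby
# Sentinel that marks "no initial value given", so nil can still be a real initial value.
NO_INITIAL = Object.new

# A simplified inject: the block result of each step is the accumulator of the next.
def my_inject(enum, initial = NO_INITIAL)
  acc = initial
  enum.each do |element|
    if acc.equal?(NO_INITIAL)
      acc = element              # no initial value: the first element seeds the accumulator
    else
      acc = yield(acc, element)  # the block result becomes the next accumulator
    end
  end
  acc.equal?(NO_INITIAL) ? nil : acc
end

p my_inject([1, 2, 3], 0) { |total, x| total + x } # => 6
p my_inject([1, 2, 3])    { |total, x| total + x } # => 6
p my_inject(%w[b c], "a") { |str, x| str + x }     # => "abc"
```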

Hope this helps,

Jesus.
 
Jesús Gabriel y Galán

The accumulator then gets updated by
the result of the block, so the next iteration will be yielded that
value.

I have realized that this sentence can be confusing: the accumulator
doesn't get updated. The next value for the accumulator will be the
result of the block, not necessarily the same object.

I have read many times that you shouldn't reuse the same accumulator by
applying destructive methods to it, but I can't remember what the pros
and cons were. So this should not be done:

irb(main):012:0> [1,2,3].inject([]) {|total,x| total << x**2}
=> [1, 4, 9]

Instead you should do this:

irb(main):013:0> [1,2,3].inject([]) {|total,x| total + [x**2]}
=> [1, 4, 9]

Maybe someone can chime in and explain this a little bit better?

Jesus.
 
Robert Klemme

2008/5/16 Jesús Gabriel y Galán said:
I have realized that this sentence can be confusing: the accumulator
doesn't get updated. The next value for the accumulator will be the
result of the block, not necessarily the same object.

Correct.

Jesús Gabriel y Galán said:
I have read many times that you shouldn't use the same accumulator by
applying destructive methods to it, but I can't remember what the pros
and cons were.

Do you remember where you read that?

Jesús Gabriel y Galán said:
So this should not be done:

irb(main):012:0> [1,2,3].inject([]) {|total,x| total << x**2}
=> [1, 4, 9]

Instead you should do this:

irb(main):013:0> [1,2,3].inject([]) {|total,x| total + [x**2]}
=> [1, 4, 9]

Maybe someone can chime in and explain this a little bit better?

Sorry, but this is nonsense. It's completely safe and even reasonable
to reuse an accumulator value. Your second solution creates new
Arrays all the time and then throws them away. It is much more
efficient to use Array#<< as in your first example.

If, of course the original accumulator value must not be changed
because side effects will do harm, then of course you cannot modify it
but need to create new objects. But in the scenario above, where the
Array is solely created for #inject it is the most reasonable thing to
directly append.
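The distinction can be checked directly. In the sketch below the two helper methods and the `shared` array are hypothetical; the point is only to show when appending in place is visible outside of inject and when it is not.

```ruby
# Builds the squares with +, which allocates a new array at every step.
def squares_with_plus(initial)
  [1, 2, 3].inject(initial) { |total, x| total + [x**2] }
end

# Builds the squares with <<, which appends to the accumulator in place.
def squares_with_append(initial)
  [1, 2, 3].inject(initial) { |total, x| total << x**2 }
end

shared = [0]                     # hypothetical array that other code still uses
copy = squares_with_plus(shared)
p copy                           # => [0, 1, 4, 9]
p shared                         # => [0], untouched

same = squares_with_append(shared)
p same.equal?(shared)            # => true, inject returned the very same object
p shared                         # => [0, 1, 4, 9], the caller's array was modified

p squares_with_append([])        # a fresh array: nobody else can observe the mutation
```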

Kind regards

robert

--
use.inject do |as, often| as.you_can - without end
 
Jesús Gabriel y Galán

Robert Klemme said:
Do you remember where you read that?

No, I probably misunderstood something.

Robert Klemme said:
Sorry, but this is nonsense. It's completely safe and even reasonable
to reuse an accumulator value. Your second solution creates new
Arrays all the time and then throws them away. It is much more
efficient to use Array#<< as in your first example.

Yep, I saw that and that's why I refused to even try to explain it :)

Robert Klemme said:
If, of course the original accumulator value must not be changed
because side effects will do harm, then of course you cannot modify it
but need to create new objects.

This might be what I had in mind.

Robert Klemme said:
But in the scenario above, where the
Array is solely created for #inject it is the most reasonable thing to
directly append.

It's clear that the example injecting a newly created array makes the
above explanation even worse :).

Thanks !

Jesus.
 
