SyncEnumerator?

Emmanuel Touzery

Hello,

Just to share something that happened to me today: I wrote a script that ate
all the memory on my computer, and I had no idea why. After many hours it
dawned on me that SyncEnumerator might be the culprit, since it seems to use
continuations (does it?).
In my code I could replace SyncEnumerator with Array#zip. The result: the
calculation used almost no memory (instead of several hundred megabytes) and
finished almost instantly, as it always should have!

So, a word of warning: SyncEnumerator is not as cheap as it looks. Prefer
Array#zip; at the moment I can't think of a case where SyncEnumerator offers
anything over Array#zip, since Array#zip is so much faster.
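As a rough illustration of the replacement (the names and values here are made up for the example, not taken from the attached script), Array#zip pairs two arrays element by element and pads with nil when the second array is shorter:

```ruby
# Hypothetical sample data; the attached script pairs Person objects
# with the values of one column instead.
persons = %w[anna boris cene]
values  = %w[25 31]            # deliberately one element short

# Array#zip builds all pairs in a single pass over the arrays.
pairs = persons.zip(values)
# pairs is [["anna", "25"], ["boris", "31"], ["cene", nil]]

pairs.each do |person, value|
  puts "#{person}: #{value.inspect}"
end
```

Note that the nil padding means a short row of values does not raise an error; the loop body has to cope with nil itself.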

I've attached the script. It doesn't actually run to completion (it works on
data that I can't send here), but you can see the difference between running
it with SyncEnumerator and with Array#zip. Amazing!

At line 12 in the script:
# s = SyncEnumerator.new(persons, values)
s = persons.zip(values)

Switch the comment between the two lines to see the change.
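A possible further variation, sketched under the assumption that the paired array is only needed for the loop: Array#zip also accepts a block, in which case it yields each pair and returns nil instead of building the combined array. The Struct below is a hypothetical stand-in for the script's Person class.

```ruby
# Hypothetical stand-in for the Person class in the attached script.
Person = Struct.new(:name, :age)

persons = [Person.new("p198"), Person.new("p199")]
values  = %w[25 24]

# Block form: each [person, value] pair is yielded in turn and
# no intermediate array of pairs is returned.
result = persons.zip(values) do |person, value|
  next if person.age           # keep a value that is already set
  person.age = value.to_i
end

puts persons.map { |person| person.age }.inspect
```

Whether this matters in practice depends on the array sizes; for the ten-column rows in this data the plain `persons.zip(values)` form is likely just as good.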

My conclusion for now: SyncEnumerator may one day be more readable than
Array#zip, but in current implementations of Ruby it's way too slow.

emmanuel

Attachment: raz.cvs

198 199 200 201 202 203 204 205 206 207
col1 M M M M M M M M M M
col2 25 24 25 23 25 22 20 18 25 28
col3 aaaaaaaaaaa aaaaaaa aaaaaa aašaaa šaaaaaa aaaaaaaaaaa šaaaaaa aaaaaaaaaaa a.a. aaaaaaa aaaaa
col4 b b b b b b b b.o. a b
col5 4 4 3 4 3 5 3 3 3 2
M N S N N N M M N M
aaaaa - - aačaaa aaaaša - - - - - aaaaaaa, aaaaaaa aaaaaaaaaa aaaaa -
198 199 200 201 202 203 204 205 206 207
208 209 210 211 212 213 214 215 216 217
col1 M M M M M M Ž Ž M Ž
col2 14 25 24 25 22 17 25 23 25 20
col3 aaaaaaašaaaa aaaaaaaaa aaaaaa aaaaaaa aaaaaaaaaaa aaaaaaaa aaaaaaa aaaaa aaaaaaa šaaaaaaaa šaaaaaa šaaaaaaaa
col4 b a b b a a b b b b
col5 4 4 2 4 5 3 2 5 3 3
N N S N N N N N N N
208 209 210 211 212 213 214 215 216 217
208 209 210 211 212 213 214 215 216 217
218 219 220 221 222 223 224 225 226 227
col1 izločen, opisan primer 38 letnega Ž M M Ž Ž M M M Ž
col2 21 25 25 25 19 20 20 18 24
col3 šaaaaaaaa šaaaaaa aaaaaaaa aaaaaa aaaaaaa aaaaaaaaa aaaaaa aaaaačaaa aaaaaaa šaaa aaačaaa aaaaaaa šaaa aaaačaaa aaaaaaaaaa šaaaaaaaa
col4 b b a b b b a a b
col5 4 4 4 3 4 2 3 2 5
218 219 220 221 222 223 224 225 226 227
M N S M M M M
218 219 220 221 222 223 224 225 226 227
228 229 230 231 232 233 234 235 236 237
col1 M M M M M M M M Ž M
col2 25 22 25 23 23 20 25 27 22 24
col3 aaaaaačaa aaaaaaa-aaaaaaaa aaaaaaaaaaa aaaaaaa aaaaaaaa aaaaaa aaaaaaaaaaa aaaaaaačaa aaaaaa aaaaa šaaaaaaaa aaaaa aaaaaa
col4 b b.o. b a b b b b a b
col5 2 3 3 4 3 4 3 3 5 3
M M M N E M N N N S
228 229 230 231 232 233 234 235 236 237
228 229 230 231 232 233 234 235 236 237
238 239 240 241 242 243 244 245 246 247
col1 a Ž a a aaaačaa, aaaaaa aaaaaa aa aaaaaaa aaaačaa, aaaaaaaaaaaaa aaaaa a a a a
col2 25 24 20 19 25 21 21 18
col3 žaaaaaaaa a.a. a.a. aaaaa šaaaaaa aaaaaaa šaaa aaaaa aaaaa
col4 a b b b b b a b
col5 3 do4 2 3 2 4 3 3 do 4 4
238 239 240 241 242 243 244 245 246 247
E M N E S N E M
238 239 240 241 242 243 244 245 246 247
248 249 250 251 252 253 254 255 256 257
col1 Ž M M M Ž M M M M M
col2 21 24 22 25 25 21 24 22 19 24
col3 aaaaaaaaa aaaaaa aaaa aaaaaaa šaaaaaa a.a. šaaaaaaaa šaaaaaa aaaaaaaaaa šaaaaaa aaaaa aaaaaaaa aaaaaa
col4 b.o. brez poklica b a b b b.o. b a b
col5 4 2 4 3 5 4 do 5 3 4 4 2
M M M M M N S E N M
248 249 250 251 252 253 254 255 256 257
258 259 260 261 262 263 264 265 266 267
col1 M M M M M M M M M M
col2 17 21-22 23 25 23 29 28 20 26 21
col3 aaaaaaašaaaa aaaaaaa aaaaaaaaaaa aaaaaaa, aaaaaaaa aa aaaa aaaaaaaaa šaaaaaa aaaaaaačaa aaaaa aaaaaaa
col4 b b b b.o. b a b b b a
col5 1-6razred 5,7-8 pa2 3 3 3 5 3 3 3 3 2
M M N S E S N S M S
258 259 260 261 262 263 264 265 266 267
258 259 260 261 262 263 264 265 266 267
268 269 270 271 272 273 274 275 276 277
col1 M M M M M M M Ž M M
col2 22 20 21 22 22 24 21 17 25 15
col3 aačaaaaaašaa aaaaaa aaaaaaa aaaaa aa šaaa aaaaa aa šaaa aaaaa aaaaačaa aaaaaa aaaaaaaa aaaaaa aa aaaaa šaaa aaačaaa aŠ aa aaaaa šaaa
col4 b b b b b b b b b b
col5 b.o. 3 5 3 2 3 3 3 2 3
268 269 270 271 272 273 274 275 276 277
M M N N M N M N S E
268 269 270 271 272 273 274 275 276 277
278 279 280 281 282 283 284 285 286 287
col1 M Ž Ž Ž M M M M M M
col2 21 22 18 26 23 21 24 23 25 24
col3 aa aaaaa šaaa aa aaaaa šaaa aa aaaaa šaaa aa aaaaa šaaa aaaaaaaa aaaaaa aaaaaaa aa aaaaa šaaa aaaaaaaaaaa aaaaaa a.a. aaaa a aaaaaaaa a aaaaaaaa, aaačaaa aaaaaa
col4 b b b b b b b b b b
col5 3 5 b.o. 5 3 ali4 3 4 5 2 4
278 279 280 281 282 283 284 285 286 287
S N M E N M E N N M
278 279 280 281 282 283 284 285 286 287
288 289 290 291 292 293 294 295 296 297
col1 M M M M M M M M M M
col2 18 20 23 25 21 25 21 23 20 27
col3 aaaaa aa šaaa a.a. aaaaaaaaaa aaaaaa aa aaaaa šaaa aaaaaa aaaaa aaačaa aŠ aaaaa aa šaaa aaaaaa
col4 b b b b.o. a b b a b b
col5 3 3 4 b.o. 3 b.o. 3 2 4 3
288 289 290 291 292 293 294 295 296 297
M M N M E M N M N N
288 289 290 291 292 293 294 295 296 297
298 299 300 301 302 303 304 305 306 307
col1 M M M M M M M M M M
col2 25 25 18 20 21 28 22 25 22 24
col3 aaaaaaaaa aaaaaa aaaaaaa aaaaaaaaaaa aaaaa aa aaaaa šaaa aaaaaa a.a. aaaa aaaaaaa a.a. žaaaaaaa aaaaaa aaaaaaaaa aaaaaa
col4 b b b b b b a a b b
col5 3 2 zadosten 5 3 3 2 2 2 4
N S N N E M M E S N
298 299 300 301 302 303 304 305 306 307
308 309 310 311 312 313 314 315 316 317
col1 M M M M M M Ž M Ž ni ankete
col2 25 24 18 23 25 24 23 22 21
col3 aaaačaaaačaa a.a. aa aaaaa šaaa aaaaa aaaaaa: aaaaaaa aaaaaaa aaaaaaaa aa aaaaa šaaa aa šaaa
col4 b b b b b b b b b
col5 3 2 5 3 3 4 4 5 2
N S M N N M M M E
308 309 310 311 312 313 314 315 316 317
318 319 320 321 322 323 324 325 326 327
col1 M M M Ž M M M M M ni ankete
col2 19 21 26 20 20 23 23 24 23
col3 aaaaaaaaaaaaaaa aaaaaaa aaaaa aaaaaaaaa aa aaaaa šaaa aaaaa aaaaaaa a aaaaaaa aaaaaa a.a. aaaaa
col4 b a b b b b b b b
col5 4 2 3 5 3 2 3 2 b.o.
M N N M M N E E M
318 319 320 321 322 323 324 325 326 327
328 329 330 331 332 333 334 335 336 337
col1 M Ž M Ž M M Ž Ž M M
col2 24 23 24 17 22 22 21 24 19 24
col3 aa aaaaa šaaa aa aaaaa šaaa a.a. aa aaaaa šaaa aaaaaaaaaaaaaaaa a.a. aa aaaaa šaaa aaaaa-aaaaščaa aaaaaa
col4 b b b.o. b.o. b b b b a b
col5 4 5 4 4 3 b.o. 4 4 3 4
N N N M N M N N N E
328 329 330 331 332 333 334 335 336 337
338 339 340 341 342 343 344 345 346 347
col1 M M Ž M Ž Ž Ž M Ž M
col2 24 17 20 24 18 25 21 15 16 20
col3 aa aaaaa šaaa aa aaaaa šaaa aa aaaaa šaaa aaaaaaaaaaa aa aaaaa šaaa aaaaaaaa aa aaaaa šaaa aaaaa aa šaaa aaaaa aa šaaa aaaaa aa šaaa
col4 b b b b b b b b b b
col5 4 3 4 3 4 3 5 3 4 5
M N M M M E S M N M
338 339 340 341 342 343 344 345 346 347
348 349 350 351 352 353 354 355 356 357
col1 M Ž M M Ž M M M M M
col2 20 22 25 24 20 24 23 20 20 23
col3 aa aaaaa šaaa aaaaa aa šaaa aaaaaaaaa aaaaaa aaaaaaa a.a. aačaa aaaaaaaaaa aaaa aaaaaaaaašaaa a aaaaaaa aa aaaaa šaaa aaaaaaa šaaa
col4 b b b b b b b b b a
col5 5 5 3 3 4,5 3 3 4 2 3
E N N S N N N S N E
348 349 350 351 352 353 354 355 356 357
348 349 350 351 352 353 354 355 356 357
358 359 360 361 362 363 364 365 366 367
col1 M M Ž M ni ankete M M M Ž M
col2 22 20 18 17 20 23 22 20 25
col3 aaaaa a.a. aa aaaaa šaaa aa aaaaa šaaa aaaaaaaa aaaaa aaaaaaaaa aaaaaa a.a. aaaaaaa aaaaaaa
col4 a b b b b a b b a
col5 b.o. 3 5 3 3 3 3 2 2do 3
M E N M M N M M E
358 359 360 361 362 363 364 365 366 367
368 369 370 371 372 373 374 375 376 377
col1 Ž Ž Ž M M M M Ž M M
col2 18 20 23 18 25 21 22 15 23 24
col3 aa aaaaa šaaa aaaaa aa šaaa aa aaaaa šaaa aa aaaaa šaaa aaaa.aaaaaaaaa aa aaaaa šaaa aa aaaaa šaaa aa aaaaa šaaa aa aaaaa šaaa aaaaaaaaaaa aaaaaa
col4 b b b a b b b a b a
col5 4 5 3 2 5 3 4 4 3 4
N N D M M M N E N E
368 369 370 371 372 373 374 375 376 377
378 379 380 381 382 383 384 385 386 387
col1 M M Ž M M M M M M M
col2 25 20 20 23 23 22 25 19 25 24
col3 aaaaaaa aaaaaa aa aaaaa šaaa aa aaaaa šaaa aa aaaaa šaaa aaaaa aaaaaaaaaaa aaaaaaaaa aašaaaaaaa aaaaa aa šaaa aaaaaaaaaaa aaaaa
col4 b b b b b b a b a b
col5 4 4 4 3 do 4 2 2 3 3 3 4
M M M N N S N N M M
378 379 380 381 382 383 384 385 386 387
388 389 390 391 392 393 394 395 396 397
col1 ni ankete Ž Ž M M M M izločen, anketiran 40 letni moški M napaka
col2 22 20 24 21 21 18 21
col3 aa aaaaa šaaa aa aaaaaaa šaaa aaaaaaaaaaaaa aaaaa aaaa aaa aaaaaaa aa aaaaa šaaa aaaaaaa
col4 b b a b b b b
col5 4 4 5 3 5 3 3
b.o. M N M N N M
388 389 390 391 392 393 394 395 396 397
388 389 390 391 392 393 394 395 396 397
398 399 400 401 402 403 404 405 406 407
col1 a a aaaaaa a aaaaaa aaaačaa, aaaa aaaaaaaaa aaaaa aaaačaa-aaaaa aaaa aaaačaa-aaaaa aaaa aaaačaa-aaaaa aaaa a
col2 23 25 24 21
col3 aaaaaaaa aaaaaaaa aaaaaa aaaaaaa šaaaaaa-aaaaa aa šaaa
col4 a b b b
col5 3 3 3 5
M M N N
398 399 400 401 402 403 404 405 406 407
408 409 410 411 412 413 414 415 416 417
col1 Ž M M M M M M M Ž M
col2 24 25 20 22 25 23 21 21 23 25
col3 aaaažaaaaa aaaaaa aaaaaaa aaaa aaaaaaa aaaaa aa šaaa-šaaaaaa aaaaaaaa aaaaaa aaaaačaa aaaaaa a.a. aaaaa aa šaaa-šaaaaaa aaaaaaaaa aaaaaa aaaaaa
col4 b b brez poklica b b b b b b b
col5 5 3 2 5 3 4 3do4 b.o. 4 2
N M M S N E M S E S
408 409 410 411 412 413 414 415 416 417
418 419 420 421 422 423 424 425 426 427
col1 Ž Ž Ž M M M Ž M M M
col2 16 21 25 20 25 24 21 20 24 21
col3 aaaaa aa šaaa-aaaaaaaaa aaaaa aa šaaa aaaaa aa šaaa aaaaa aa šaaa šaaaaaa aaaaaaa aaaaaa aaaaaaačaaa aŠ aaaaaaaaaaa aaaaaaa aaaaaa
col4 b b b b a b b b b.o. b
col5 3 4 4 4 4 4 4 3 2 2
N M N M N N M S S N
418 419 420 421 422 423 424 425 426 427
428 429 430 431 432 433 434 435 436 437
col1 M M M izločen-ni anketnega lista Ž M izločen, ni anketnega lista M M Ž
col2 20 23 17 28 24 23 23 22
col3 aaaaa aa šaaa aaaažaa aaaaa aaaaaaa-aaaaaaaaaaaaaaa a aaaaaaa-aaaaaaaaa aaaaa aa šaaa aaaaaaaaaaa aaaaačaaaa aaaaaaa aaaaaa-aaaaaaaaa aa aaaaaaaaa čaaaaaa aaaaa aaaaa-aaaaa aa šaaa
col4 a b b b b a a a
col5 3 3 5 5 3 3 2 4
E N S b.o. M S E E
428 429 430 431 432 433 434 435 436 437
438 439 440 441 442 443 444 445 446 447
col1 M M M M Ž Ž Ž M Ž M
col2 24 25 26 24 22 16 25 23 24 24
col3 aaaaaaaaa aaaaaa aaaaaaaa aaaaaa aaaaaaaaaaaa šaaaaaa-aaaaa aa šaaa šaaaaaa-aaaaa aa šaaa aaaaaaaaa-aaaaa aa šaaa aaaaaaaaaa,aaaaaaa aaaaaaa aaaaaaaaaa aaaačaaaačaa
col4 b b b b b b b a b a
col5 4 4 3 3 4 5 3 3 4 3
M N N N M N N N M M
438 439 440 441 442 443 444 445 446 447
438 439 440 441 442 443 444 445 446 447
448 449 450 451 452 453 454 455 456 457
col1 M Ž M M M M M Ž Ž M
col2 23 23 24 25 23 22 23 24 16 24
col3 šaaaaaa šaaaaaa aaaa aaaaaaa aaaaaa aaaaaaaaaaaa aaaa aaaaaaa a.a. šaaaaaaaa aaaaaaaaa aaaaaaa
col4 a b brez poklica b b brez poklica b.o. b iz b v a b
col5 3 4 4 b.o. 3 2 4 5 4 4
448 449 450 451 452 453 454 455 456 457
M M M N S S E N N N
448 449 450 451 452 453 454 455 456 457
458 459 460 461 462 463 464 465 466
col1 M M M M M M Ž M M
col2 25 24 20 25 25 20 20 24 21
col3 šaaaaaa aaaaaaščaaa aaaa aaaaaaa aaaaaaaa aaaaaaa aaaaaaa-aaaaaaa aaaaaaaaa aaaaaaaaaa aaaaa
col4 a a a b.o. a b b a b
col5 4,5 3 3 4 do 5 3 3 4 4 3
N M M M M E M E M
458 459 460 461 462 463 464 465 466
458 459 460 461 462 463 464 465 466

Attachment: parser2.rb

require 'generator'

def usage(msg); puts msg; exit 1; end

def parse_line(persons, lines, member, col_name)
  while (line = lines.next.chomp) !~ /^#{Regexp.escape(col_name)}/
    #puts "#{line} doesn't match skipping it"
  end
  values = line.split(/\t/)
  values.shift # remove the column name
  # s = SyncEnumerator.new(persons, values)
  s = persons.zip(values)
  s.each { |person, value|
    # do not overwrite a value that has already been set
    next unless person.send(member).nil?
    new_value = block_given? ? yield(value) : value
    person.send("#{member}=", new_value)
  }
end

class Person
  attr_accessor :person_id, :is_male, :age, :job, :state,
                :education

  def initialize(person_id)
    @person_id = person_id
  end
end

if $0 == __FILE__

  file_name = ARGV.shift || usage("missing filename of statistics file")
  f = File.open(file_name)
  # Generator (from the 'generator' library) gives next?/next access to the lines
  lines = Generator.new(f)

  all_persons = []

  while lines.next?

    # skip ahead to the next header line (a row of tab-separated person numbers)
    line = lines.next.chomp while (lines.next? && line !~ /^(\t\d+)+\s*$/)
    break if line !~ /^(\t\d+)+\s*$/

    # found a series of data. first line => series
    # of person numbers
    person_ids = line.scan(/\d+/).map { |i| i.to_i }
    line = nil # clear for next iteration
    persons = person_ids.map { |p_id| Person.new(p_id) }

    # now the series of genders
    parse_line(persons, lines, :is_male, 'col1') { |gender_s| (gender_s == 'M') }
    # now the age
    parse_line(persons, lines, :age, 'col2')
    # now the job
    parse_line(persons, lines, :job, 'col3')
    # now the state
    parse_line(persons, lines, :state, 'col4')
    # now the education
    parse_line(persons, lines, :education, 'col5')

    all_persons.concat(persons)
    puts "parsed #{all_persons.size} people"
    # print free memory after each block (Linux only)
    File.open('/proc/meminfo') { |memi| puts memi.grep(/MemFree/) } if RUBY_PLATFORM =~ /linux/
  end

  f.close

  # Marshal output is binary, so open the file in binary mode
  File.open('data.dta', 'wb') do |file|
    Marshal.dump(all_persons, file)
  end

end
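
A small sketch of how the header detection in the script behaves, using the same regular expression. The constant name and the helper method are introduced here for illustration only. It shows that a line is recognized as a row of person numbers only when each number is preceded by a tab, so data in which the tabs have been turned into spaces would not be recognized.

```ruby
# Same pattern as in parser2.rb; HEADER and person_ids are hypothetical names.
HEADER = /^(\t\d+)+\s*$/

# Returns the person numbers of a header line, or nil for any other line.
def person_ids(line)
  return nil unless line =~ HEADER
  line.scan(/\d+/).map { |i| i.to_i }
end

p person_ids("\t198\t199\t200")   # tab-separated header row
p person_ids("col1\tM\tM\tM")     # a data row, not a header
p person_ids("198 199 200")       # spaces instead of tabs: not recognized
```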
