Perl/RespectTheGlobalStateOfTheFlipFlopOperator

Original: [Respect the global state of the flip flop operator « The Effective Perler]
A summary of the key points and example code.
See also Perl/RangeOperator.
If you spot anything I have misinterpreted, please point it out.

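Perl's flip-flop operator .. (that is, .. used in scalar context; in list context it is the range operator that specifies a range[1]) returns false until its left-hand side becomes true. Once the left-hand side is true, it returns true up to and including the evaluation in which the right-hand side becomes true. In other words, the left-hand side turns the operator "on" and the right-hand side turns it "off".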
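The on/off behaviour can be tried without any input file. The following is only a minimal sketch: the @lines data and the between_markers helper are made up for illustration and are not part of the original article.

```perl
use strict;
use warnings;

# Hypothetical in-memory input standing in for the lines of a file.
my @lines = ( 'skip', 'START', 'keep', 'END', 'skip again' );

# Return only the lines from a START line through the next END line.
sub between_markers {
    my @kept;
    for ( @_ ) {
        # In scalar (boolean) context, .. is the flip-flop: /START/ turns
        # it on, and it stays on through the line where /END/ matches.
        push @kept, $_ if /START/ .. /END/;
    }
    return @kept;
}

print "$_\n" for between_markers( @lines );
```

This prints START, keep and END, one per line. Both marker lines are included because the operator is still true on the evaluation where the right-hand side matches.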

Consider a simple file that contains START and END markers:

# input.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
END
Don't show this
Or this

We want to extract only the lines from one marker through the other[2]:

# flip-flop
while( <> ) {
        print if /START/ .. /END/; # print, not say: each line still ends with its newline
        }

Output:

% perl flip-flop input.txt
START
Show this
And this
Also this
END

Once the flip-flop operator has gone back to false, it becomes true again the next time its left-hand side is true.

Here is a file with two marked regions:

# input2.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
END
Don't show this
Or this
START
Show this again
And this again
Also this again
END
But ignore this

Running the program prints both regions:

% perl flip-flop input2.txt
START
Show this
And this
Also this
END
START
Show this again
And this again
Also this again
END

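Things get more complicated when you use a flip-flop operator more than once without knowing its current state. Let's change the program so that, instead of reading every file through the ARGV filehandle, it processes each file separately: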

foreach my $file ( @ARGV )
        {
        open my $fh, '<', $file or die "Could not find $file\n";
        while( <$fh> ) {
                print if /START/ .. /END/; # print, not say: lines are not chomped
                }
        }

To test it, split the input into input2a.txt and input2b.txt so that each file contains one region to print:

# input2a.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
END
Don't show this

# input2b.txt
Or this
START
Show this again
And this again
Also this again
END
But ignore this

The output looks the same as that of the previous program:

% perl flip-flop input2a.txt input2b.txt
START
Show this
And this
Also this
END
START
Show this again
And this again
Also this again
END

This is where it gets confusing. The flip-flop operator does not care which file it is currently examining or what happened in the previous file. To trigger the problem, edit input2a.txt and delete its END marker:

# input2a.txt
Ignore this
Ignore this too
START
Show this
And this
Also this
Don't show this

Because the region in input2a.txt was never closed, the flip-flop operator is still true when the program starts reading the second file:

% perl flip-flop input2a.txt input2b.txt
START
Show this
And this
Also this
Don't show this
# input2b.txt
Or this
START
Show this again
And this again
Also this again
END

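Each flip-flop operator maintains its own global state. That state is not affected by a new loop, a new iteration, or anything else. Putting the operator inside a subroutine does not make it safe either: every operator that perl compiles has its own state, and perl compiles the subroutine only once: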

foreach my $file ( @ARGV )
        {
        open my $fh, '<', $file or die "Could not find $file\n";
        extract( $fh );
        }

sub extract {
        my( $fh ) = shift;

        while( <$fh> ) {
                print if /START/ .. /END/; # this is the same .. on every call
                }
        }

This program produces the same result as the previous one.
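The leak between calls can also be reproduced without files. This is a rough sketch under that assumption: extract here takes a list of lines instead of a filehandle, and the sample lines are invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical list-based variant of extract: returns the lines the
# flip-flop lets through instead of printing them.
sub extract {
    my @kept;
    for ( @_ ) {
        push @kept, $_ if /START/ .. /END/;   # the same .. on every call
    }
    return @kept;
}

my @first  = extract( 'a', 'START', 'b' );              # region never closed
my @second = extract( 'c', 'START', 'd', 'END', 'e' );  # starts while still "on"

print "first:  @first\n";
print "second: @second\n";
```

The first call returns START and b. The second call returns c, START, d and END: the line c gets through only because the operator was left "on" by the first call.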

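You might think that compiling a new flip-flop operator on each iteration would solve this. The following code, however, does not behave the way you want. When perl compiles the code, it recognizes that the anonymous subroutine returned each time is always the same code, and it reuses it: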

foreach my $file ( @ARGV )
        {
        open my $fh, '<', $file or die "Could not find $file\n";
        make_extractor()->($fh);
        }

sub make_extractor {
        sub { # only compiled once
                my( $fh ) = shift;

                while( <$fh> ) {
                        print if /START/ .. /END/;
                        }
                };
        }

If you want to verify this, you can dump the return values of make_extractor:

# dump-subs.pl
use Devel::Peek;

my @subs = map { make_extractor() } 1 .. 3;

print Dump( $_ ) foreach @subs;

sub make_extractor {
        sub { # only compiled once
                my( $fh ) = shift;

                while( <$fh> ) {
                        print if /START/ .. /END/;
                        }
                };
        }

You get the same subroutine every time, which means you get the same flip-flop:

% perl dump-subs.pl
SV = RV(0x80f66c) at 0x80f660
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x81a4f0
  SV = PVCV(0x80e4b8) at ...
SV = RV(0x80f6fc) at 0x80f6f0
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x81a4f0
  SV = PVCV(0x80e4b8) ...
SV = RV(0x8030bc) at 0x8030b0
  REFCNT = 2
  FLAGS = (ROK)
  RV = 0x81a4f0
  SV = PVCV(0x80e4b8) ...
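The same check can be made without reading a dump by comparing reference addresses. The sketch below assumes Scalar::Util's refaddr; the names make_shared, make_fresh and distinct are hypothetical helpers introduced only for this comparison.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Returns an anonymous sub that refers to no outer lexical variable.
sub make_shared {
    return sub {
        my $fh = shift;
        while ( <$fh> ) { print if /START/ .. /END/ }
    };
}

# Returns an anonymous sub that captures $id, so it is a closure.
sub make_fresh {
    my $id = shift;
    return sub {
        my $fh = shift;
        while ( <$fh> ) { print "$id: $_" if /START/ .. /END/ }
    };
}

# Count how many different code references are in the list.
sub distinct {
    my %seen;
    $seen{ refaddr($_) }++ for @_;
    return scalar keys %seen;
}

my @shared = map { make_shared() }    1 .. 3;
my @fresh  = map { make_fresh( $_ ) } 1 .. 3;

print "non-closure: ", distinct( @shared ), " distinct sub(s)\n";
print "closure:     ", distinct( @fresh ),  " distinct sub(s)\n";
```

The non-closure version reports one distinct subroutine, matching the identical RV addresses in the dump, while the closure version reports three.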

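So each subroutine has to be made different in some way. The solution is a closure, that is, a subroutine that references a lexical variable from outside its own scope. The code below uses state to keep track of how many flip-flop operators have been created. Because each new anonymous subroutine has to capture $count, perl cannot reuse the anonymous subroutine it defined earlier. This forces it to create a new subroutine every time: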

# flip-flop
use 5.010;

foreach my $file ( @ARGV )
        {
        open my $fh, '<', $file or die "Could not find $file\n";
        make_extractor()->($fh);
        }

sub make_extractor {
        state $count = 0;
        $count++;

        sub {
                my( $fh ) = shift;

                while( <$fh> ) {
                        print "$count: $_" if /START/ .. /END/;
                        }
                };
        }

Now each input file gets its own flip-flop operator. You can see that when the first input file ends (without its end marker) and the second file begins, a new flip-flop is in effect:

% perl flip-flop input2a.txt input2b.txt
1: START
1: Show this
1: And this
1: Also this
1: Don't show this
2: START
2: Show this again
2: And this again
2: Also this again
2: END
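The independence of the per-closure state can be checked in memory as well. This is a sketch rather than the article's program: make_extractor here takes a hypothetical $label argument in place of the state counter, and the returned sub works on a list of lines rather than a filehandle.

```perl
use strict;
use warnings;

sub make_extractor {
    my $label = shift;   # captured below, which makes the inner sub a closure

    return sub {
        my @kept;
        for ( @_ ) {
            # Each closure carries its own state for this flip-flop.
            push @kept, "$label: $_" if /START/ .. /END/;
        }
        return @kept;
    };
}

my @first  = make_extractor( 1 )->( 'a', 'START', 'b' );              # no END
my @second = make_extractor( 2 )->( 'c', 'START', 'd', 'END', 'e' );

print "$_\n" for @first, @second;
```

The second extractor skips c even though the first one never saw an END, because it starts with its flip-flop "off".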

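For more details, see [the Range Operators section of the perlop documentation].

Key points:

Every flip-flop operator maintains its own global state.
Flip-flop operators are not affected by scope.
Wrap the operator in a closure to create a new flip-flop.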

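Footnotes:
1. The original article's "otherwise known as the range operator in scalar context" appears to be a mistake for "list context".
2. say appends a newline, so using it on lines that have not been chomped prints an extra blank line after each one. print is the right choice for unchomped input and produces the output shown.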

Last edited: 2012-2-11 12:25 am