[cg] Unification in complex contexts

Discussion:

Atro Voutilainen

2015-03-06 12:11:25 UTC

Hi,

I wrote a rule to SELECT two coordinated finite verb readings (on the basis
of a uniqueness generalisation), but the rule disambiguated only the first
one (the target). The grammar looks something like this:

LIST vfin = (V Pres) (V Imp) (V Past) ;

SELECT $$vfin IF (*-1C clb BARRIER vfin) (*1C (CC) BARRIER vfin OR clb LINK
1 $$vfin LINK *1C clb BARRIER vfin) ;

Is this something to look at? I can provide a concrete example if needed.

Best,
Atro

--
You received this message because you are subscribed to the Google Groups "Constraint Grammar" group.
To unsubscribe from this group and stop receiving emails from it, send an email to constraint-grammar+***@googlegroups.com.
To post to this group, send email to constraint-***@googlegroups.com.
Visit this group at http://groups.google.com/group/constraint-grammar.
For more options, visit https://groups.google.com/d/optout.

Tino Didriksen

2015-03-06 12:19:15 UTC

SELECT can only affect 1 cohort at a time. If you want to also remove
readings from the paired cohort, you need another SELECT rule that searches
backwards.
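
An untested sketch of what such a pair of rules might look like. The second
rule is an assumed mirror image of the one in the original post, not a
verified solution; the set clb is not defined in the snippet above and is
assumed to be declared elsewhere in the grammar.

```
LIST vfin = (V Pres) (V Imp) (V Past) ;

# Rule 1 (from the original post): targets the first verb and looks
# rightwards for the conjunction and the second verb.
SELECT $$vfin IF (*-1C clb BARRIER vfin) (*1C (CC) BARRIER vfin OR clb LINK 1 $$vfin LINK *1C clb BARRIER vfin) ;

# Rule 2 (assumption): targets the second verb and looks leftwards for the
# conjunction and the first verb. The leftward scan for the first verb uses
# only clb as a barrier, so it is slightly less strict than rule 1.
SELECT $$vfin IF (-1C (CC) LINK *-1 $$vfin BARRIER clb LINK *-1C clb BARRIER vfin) (*1C clb BARRIER vfin) ;
```

Each rule only changes its own target cohort, so the two together should
cover both verbs of the coordination.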

-- Tino Didriksen

Atro Voutilainen

2015-03-06 12:22:19 UTC

Thanks!
Atro

JOSE MARIA ARRIOLA

2016-08-31 14:59:53 UTC

Hi,
when applying the CG3 disambiguation grammar for Basque I got the
following message:
Warning: Hard limit of 500 cohorts reached at line 2.219 - forcing break.
What does it mean?

Thank you very much.

Jose Mari Arriola

Tino Didriksen

2016-08-31 15:23:40 UTC

It means that in 500 words of the input, none matched the Delimiters you
defined in the grammar.

This usually means the input is malformed somehow (very long run-on
sentences), or your delimiters are underspecified.

If there truly are no valid hard delimiters, you can use Soft-Delimiters to
introduce some sort of structure (usually commas or similar soft
line-of-thought breaks).
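
As a rough illustration, the two kinds of delimiters are declared at the top
of the grammar roughly like this. The wordforms are assumptions chosen for
the example; the real ones depend on how the tokeniser writes punctuation.

```
# Hard delimiters: a window ends at any cohort whose wordform matches one
# of these tags.
DELIMITERS = "<.>" "<!>" "<?>" ;

# Soft delimiters: only used as break points once a window has grown past
# the soft limit without meeting a hard delimiter.
SOFT-DELIMITERS = "<,>" ;
```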

-- Tino Didriksen

JOSE MARIA ARRIOLA

2016-09-01 10:06:04 UTC

Ok. Thank you very much.
JM

Edward Garrett

2016-09-01 10:11:32 UTC

I have run into this issue myself. Can the hard limit be changed? If so,
does performance diminish considerably as this limit is increased?

I ask because it can be convenient for some purposes, and some languages or
registers, to avoid defining the "sentence", and to break texts into
considerably larger chunks.

Tino Didriksen

2016-09-01 10:17:44 UTC

You can set your own limits with the options:
  --soft-limit   number of cohorts after which the SOFT-DELIMITERS kick in; defaults to 300
  --hard-limit   number of cohorts after which the window is forcefully cut; defaults to 500

Performance will depend on what kind of rules you have. If they'll
indiscriminately scan the whole "sentence" that can take a while. If
they're limited to closer contexts, it makes no difference how big the
sentence is.

-- Tino Didriksen

JOSE MARIA ARRIOLA

2016-09-19 15:55:29 UTC

Hi,
After exploring the differences between applying the grammar by itself
and applying it inside the tagging module, we have observed a problem
that occurs only inside the tagging module. Some offset marks are
included with the delimiters, for instance:
"<$.>"<PUNT_PUNT#7824-7824#p47#>" instead of "<$.>"<PUNT_PUNT>"
The #7824-7824#p47# tag includes information about line number and
paragraph number. The problem is that we get the following message:
Warning: Hard limit of 500 cohorts reached at line 2.219 - forcing
break.
This happens because the delimiters are not recognized. We define the
delimiters in our grammar in a static way. Is it possible to define
the delimiters by means of a regular expression?

Thank you,
Jose Mari Arriola

Tino Didriksen

2016-09-19 16:51:40 UTC

Instead of embedding the information into the tag itself, you can put it as
static information on the cohort, à la:
"<$.>" PUNT_PUNT#7824-7824#p47#

That is, the wordform followed by a space and then any tags you want. The
caveat is that the tags will be visible to rules, but if they're esoteric
enough then it won't matter.

But yes, you can also use regex for delimiters, if you so desire.
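
A minimal, untested sketch of both options on the grammar side. With the
cohort layout shown above, the wordform is again plain "<$.>", so a static
delimiter is enough. The regex variant is only an assumption about what a
pattern could look like if extra material stays inside the wordform; it
relies on the "..."r regular-expression tag form.

```
# Option 1: static delimiter; the offset information lives in a separate
# static tag on the cohort, so the wordform is unchanged.
DELIMITERS = "<$.>" ;

# Option 2 (hypothetical pattern, use instead of option 1): any wordform
# that starts with "$." counts as a delimiter, whatever follows it.
# DELIMITERS = "<[$][.].*>"r ;
```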

-- Tino Didriksen

JOSE MARIA ARRIOLA

2016-09-19 20:14:37 UTC

Thank you very much, Tino.
Jose Mari
