Legal Robot will publish this report quarterly, the next being on or around January 1st, 2018.
On January 12, 2017, Legal Robot publicly committed to implementing principles for Algorithmic Transparency. This is our third report since making that commitment, and we are now publishing self-ratings on our progress toward implementation.
We will make our owners, designers, builders, users, and other stakeholders of analytic systems aware of the possible biases involved in their design, implementation, and use and the potential harm that biases can cause to individuals and society.
In an effort to raise general awareness of Algorithmic Transparency, our CEO, Dan Rubins, traveled to Washington, D.C. to speak at the Association for Computing Machinery’s (ACM) Panel on Algorithmic Transparency. The panel discussed the challenges, opportunities, business value, and societal impacts of algorithms with a diverse and lively crowd of political staffers, lobbyists, academics, and other stakeholders.
We will adopt mechanisms that enable questioning and redress for individuals and groups that are adversely affected by algorithmically informed decisions.
Most predictions in our app have a button to visualize and examine the details of the result; however, we don’t provide this for basic operations like sentence segmentation, part-of-speech tagging, and other NLP operations that are fairly well understood by the NLP community. Where appropriate, we also include statistical measures like precision, recall, and F1 score, as well as the size, source, and scope of the underlying dataset, and details about the design of the algorithm used for the prediction. Of course, we don’t expect everyone to be able to interpret this technical data, so we also allow anyone to share the results with our team for more explanation.
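For readers unfamiliar with the statistical measures mentioned above, the following is a minimal sketch of how precision, recall, and F1 score are computed for a yes/no prediction. The labels are made-up illustrative data, and the function name is hypothetical; this is not Legal Robot's evaluation code.

```python
def precision_recall_f1(true_labels, predicted_labels, positive=True):
    """Return (precision, recall, F1) for the class treated as 'positive'."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actually positive but predicted negative.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical data: is each clause of a particular type?
truth = [True, True, True, False, False, True]
predicted = [True, False, True, True, False, True]

precision, recall, f1 = precision_recall_f1(truth, predicted)
print(precision, recall, f1)
```

Precision answers "of the clauses flagged, how many were right?", while recall answers "of the clauses that should have been flagged, how many were found?". F1 combines the two into a single number.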
Also, anyone can ask questions over email to [email protected], even if they are not using Legal Robot. These questions are tracked separately from our normal support requests.
We will demonstrate to our users how decisions are made by the algorithms that they use, even if it is not feasible to explain in detail how the algorithms produce their results.
Many of our processes at Legal Robot use deep neural networks to process language. Neural networks can be very complex, which can make them seem incomprehensible. However, just because an algorithm seems like a black box (and is treated that way by many people using it) does not mean it cannot be explained.
To begin with, we do not use any third-party machine learning APIs at Legal Robot. This is mainly so we can control where data processing occurs. Rather than passing sensitive data to a third party as many “AI” companies do, we build our own algorithms so we can open up the internals for further analysis and explanation.
We now tag each prediction created by our software with a unique random identifier that can be used to trace back to both the algorithm and the training dataset used for each prediction in order to enable questioning and redress.
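One way such tagging could work is sketched below. The class names, field names, and version strings are all hypothetical and chosen for illustration; the report does not describe the actual data model.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Prediction:
    """A prediction plus the provenance needed to question it later."""
    label: str
    model_version: str      # hypothetical identifier of the algorithm/model
    dataset_version: str    # hypothetical identifier of the training data
    # uuid4 yields a random identifier that carries no user information.
    prediction_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class PredictionLog:
    """In-memory stand-in for whatever store keeps the provenance records."""

    def __init__(self):
        self._by_id = {}

    def record(self, label, model_version, dataset_version):
        prediction = Prediction(label, model_version, dataset_version)
        self._by_id[prediction.prediction_id] = prediction
        return prediction

    def trace(self, prediction_id):
        """Return the (model, dataset) pair behind a given prediction."""
        prediction = self._by_id[prediction_id]
        return prediction.model_version, prediction.dataset_version


log = PredictionLog()
result = log.record("indemnification clause", "clause-model-1.4", "clauses-2017-09")
print(result.prediction_id, log.trace(result.prediction_id))
```

With a record like this, a user who disputes a result only needs to quote the identifier for the responsible model and dataset to be looked up.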
We will produce explanations regarding both the procedures followed by the algorithm and the specific decisions that are made.
Some of the techniques we use yield dense vectors (basically a long string of seemingly incomprehensible numbers, like [0.78524, 0.42504, 0.60494, …]) that we use to teach an algorithm what a particular type of clause looks like (statistically speaking). However, we are working on methods to make these dense vectors more interpretable, much the same way that deep learning techniques can yield semi-interpretable layer visualizations in computer vision. We think these can provide some utility for users to understand what is happening inside the “black box.” We are focusing on these areas over the next few releases and intend to publish our results to the research community.
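To give a feel for how dense vectors can encode "what a clause looks like", here is a small sketch that compares a new vector against the average of known examples using cosine similarity. This is a common generic technique and an assumption for illustration; the report does not say which comparison Legal Robot uses, and the three-dimensional vectors are invented (real embeddings have far more dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Component-wise average of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


# Hypothetical vectors for clauses already known to be of one type.
known_examples = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
typical = centroid(known_examples)

similar_clause = [0.7, 0.2, 0.1]    # hypothetical: resembles the known type
unrelated_clause = [0.0, 0.1, 0.9]  # hypothetical: a different kind of text

sim_close = cosine_similarity(typical, similar_clause)
sim_far = cosine_similarity(typical, unrelated_clause)
print(round(sim_close, 3), round(sim_far, 3))
```

The individual numbers remain hard to read, which is why interpretability work focuses on explaining what drives a similarity score rather than the raw vector itself.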
We will provide a description of the way in which the training data was collected, along with an exploration of the potential biases induced by the human or algorithmic data-gathering process.
Every model created by Legal Robot is traceable to the specific dataset used to train it. Every data point also includes detail on how and why each sample was collected, and the details of any enrichment or manual tagging.
All of our models, algorithms, and datasets are now versioned and recorded, providing a full audit trail. We have not yet set a policy or provided a mechanism to view or download the audit trail, but are planning to release this feature soon.
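A minimal sketch of what a tamper-evident audit trail can look like follows, assuming each entry stores a SHA-256 fingerprint of the recorded artifact and of the previous entry (a hash chain). The class, event names, and payloads are hypothetical; the report does not describe the storage format.

```python
import hashlib
import json


def fingerprint(obj):
    """SHA-256 of a canonical JSON encoding, so equal content hashes equally."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AuditTrail:
    """Append-only log in which each entry commits to the one before it."""

    def __init__(self):
        self.entries = []

    def append(self, event, payload):
        previous = self.entries[-1]["entry_hash"] if self.entries else ""
        body = {
            "event": event,
            "payload_hash": fingerprint(payload),
            "previous": previous,
        }
        entry = dict(body, entry_hash=fingerprint(body))
        self.entries.append(entry)
        return entry

    def verify(self):
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        previous = ""
        for entry in self.entries:
            body = {key: entry[key] for key in ("event", "payload_hash", "previous")}
            if entry["previous"] != previous or entry["entry_hash"] != fingerprint(body):
                return False
            previous = entry["entry_hash"]
        return True


trail = AuditTrail()
trail.append("dataset-versioned", {"name": "clauses", "rows": 1200})  # made-up data
trail.append("model-trained", {"name": "clause-model", "dataset": "clauses"})
print(trail.verify())
```

Because each entry includes the hash of its predecessor, altering an old record invalidates every later entry, which is what makes the history auditable.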
We will use rigorous methods to validate our models and document those methods and results. In particular, we will explore ways to conduct routine tests to assess and determine whether the model generates discriminatory harm. We will publish a description of the methods and the results of such tests in each quarter's transparency report.
We are working on a structured approach to analyzing bias to capture both known and unknown biases. In addition to this high-level approach, we are investigating lower-level techniques like attribution to detect and evaluate bias. This quarter, we started to use automated bias analysis on some of our models, but there is still much work to do by the research community.
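As one simple example of what an automated bias check can measure, the sketch below compares how often a model gives a positive prediction for different groups of inputs (sometimes called a demographic parity gap). This is a generic technique shown under assumed data; the report does not specify which analyses Legal Robot runs, and the group labels are placeholders.

```python
from collections import defaultdict


def positive_rate_by_group(predictions, groups):
    """Fraction of positive predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for prediction, group in zip(predictions, groups):
        totals[group] += 1
        if prediction:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Largest difference in positive rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical predictions (1 = flagged) and placeholder group labels.
predictions = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = positive_rate_by_group(predictions, groups)
gap = parity_gap(rates)
print(rates, gap)
```

A large gap does not prove discrimination on its own, but it is a cheap signal that a model deserves closer review, and it only covers biases someone thought to group by.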
In quite possibly one of the most widely damaging security breaches ever, Equifax, a consumer credit reporting agency, seems to have left a server unpatched for months after repeated notification and public disclosure of a serious flaw. It should be no surprise to any programmer that all software has flaws, but it is basic security hygiene to both update software regularly and verify that countermeasures against attackers are effective. We, of course, update our servers regularly and pay attention to all security bulletins. However, human processes are fallible and prone to failure, and we want to learn what we can from Equifax’s failures. So, this quarter, we also added new automated version checking and vulnerability scanning steps to our continuous build process. In simpler terms, all code is now checked for vulnerabilities and outdated versions when we write any new code or refresh any existing infrastructure.
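The kind of automated version check described above can be sketched as follows. The package names and version numbers are invented for illustration, and a real build step would draw its minimum safe versions from a vulnerability database rather than a hard-coded dictionary.

```python
def parse_version(text):
    """Turn '2.3.32' into (2, 3, 32) so versions compare numerically.

    Comparing the raw strings would wrongly rank '2.3.5' above '2.3.32'.
    """
    return tuple(int(part) for part in text.split("."))


def find_outdated(installed, minimum_safe):
    """Return {package: (installed, required)} for packages below the minimum."""
    outdated = {}
    for name, minimum in minimum_safe.items():
        current = installed.get(name)
        if current is not None and parse_version(current) < parse_version(minimum):
            outdated[name] = (current, minimum)
    return outdated


# Hypothetical dependency data, not real packages or advisories.
installed = {"example-web-framework": "2.3.5", "example-json-parser": "1.9.0"}
minimum_safe = {"example-web-framework": "2.3.32", "example-json-parser": "1.4.2"}

outdated = find_outdated(installed, minimum_safe)
build_should_fail = bool(outdated)
print(outdated, build_should_fail)
```

Failing the build when `outdated` is non-empty turns patching from something a person must remember into something the pipeline refuses to skip.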
Deloitte’s email system was breached when attackers found an administrator account using only a password and no 2-Factor Authentication (2FA). Again, we should learn from this. At Legal Robot, all of our administrator accounts across all core services (domain, email, databases, registrar, content delivery, etc.) have always required 2FA and often use additional protections as well. However, in our own service, we do not yet provide users or Team administrators with a mechanism to enforce 2FA as a policy for their team members or collaborators. Given the sensitivity of legal documents, we feel that it is important to add this feature to our own product soon.
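For context on what a second factor adds, here is a minimal sketch of time-based one-time passwords (TOTP) as specified in RFC 6238, built on the HOTP algorithm from RFC 4226. It uses only the Python standard library and the well-known RFC test secret; it is an illustration of the standard, not Legal Robot's authentication code, and the `verify_totp` helper with its drift window is an assumption.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """HMAC-based one-time password (RFC 4226) for a given counter value."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte pick a 4-byte window.
    offset = digest[-1] & 0x0F
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def totp(secret, unix_time, step=30, digits=6):
    """Time-based variant (RFC 6238): the counter is the number of elapsed steps."""
    return hotp(secret, int(unix_time) // step, digits)


def verify_totp(secret, submitted, unix_time, step=30, window=1):
    """Accept codes from the current step and `window` steps either side."""
    counter = int(unix_time) // step
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue
        # compare_digest avoids leaking how many characters matched.
        if hmac.compare_digest(hotp(secret, counter + drift), submitted):
            return True
    return False


demo_secret = b"12345678901234567890"  # test secret from the RFCs, never use in production
code = totp(demo_secret, 59)
print(code, verify_totp(demo_secret, code, 59))
```

Because the code depends on a shared secret and the current time, a stolen password alone is not enough to log in, which is the protection the administrator account in the Deloitte incident lacked.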
Starting with the last transparency report, we began publishing statistics on our bug bounty program, links to disclosed bug reports, and detailed incident reports for serious security issues. This quarter saw a huge spike in reports for two reasons:
The last report contained a summary of every month of our bug bounty program since inception. However, since this information is available in our archive, we will only publish the most recent two quarters of results going forward.
|New||Triaged||Needs More Info||Resolved||Informative||Duplicate||Not Applicable||Spam|
We intend to disclose all reports once they are closed. However, we also respect the wishes of security researchers who are working with other organizations to resolve related issues. This quarter, we publicly disclosed the following Resolved reports:
|#249346 - Missing link to 2FA recovery code||Functional issue||None|
|#230525 - Domain takeover (legalrobot.co.za)||Domain takeover||None|
|#250457 - User enumeration||Enumeration||Low|
|#249798 - Intercom chat session information persists after logout||Improper access||Low|
|#250243 - Users with 2FA can have multiple sessions||Functional issue||None|
|#249337 - Non-functional 2FA recovery codes||Functional issue||None|
|#213936 - Token leakage by referrer||Token leakage||Low|
|#250088 - Account profile shows encryption recovery box for all users||Functional issue||None|
|#250741 - [New Feature] Password history check||Functional issue||None|
|#252544 - Token leakage by referrer header & analytics||Token leakage||Low|
|#251468 - Pages don’t render in old browsers like IE11||Functional issue||None|
|#251469 - Meta characters are not filtered into full name on profile page||Functional issue||None|
|#253448 - [Cross-domain Referer leakage] Password reset token leakage via referer||Token leakage||Low|
|#251526 - No notification on change password feature||Functional issue||None|
|#249695 - 2FA Error Handling on Google Authenticator||Error||None|
|#255021 - Profile shows incorrect account creation date||Functional issue||None|
|#250082 - Enhancement: email confirmation for 2FA recovery||Enhancement||None|
|#249339 - Missing link to TOTP manual enroll option||Enhancement||None|
|#249467 - 2FA user enumeration via login||Enumeration||None|
|#257207 - Code injection||Code injection||Low|
|#249431 - 2FA user enumeration via password reset||Enumeration||Low|
|#259416 - Incorrect email content when disabling 2FA||Functional issue||None|
|#259415 - Lengthy manual entry of 2FA secret||Functional issue||None|
|#256649 - Mixed Content over HTTPS||Functional issue||None|
|#259742 - Incorrect error message||Functional issue||None|
|#260604 - Update any profile||Improper access||Medium|
|#260278 - TabNabbing issue (due to taget=_blank)||Enhancement||Low|
|#260632 - Improper validation of parameters while creating issues||Missing validation||Low|
|#180895 - Password reset access control||Logic issue||None|
|#213180 - Password reset form ignores email field||Functional issue||None|
|#255679 - Change password logic inversion||Improper access||Low|
|#251200 - Missing Issuer parameter on TOTP 2FA||Functional issue||None|
|#262109 - UX: JS error on Password Safety link||Functional issue||None|
|#249398 - Password complexity not evenly enforced||Functional issue||None|
|#250253 - Password complexity ignores empty spaces||Functional issue||None|
|#260648 - CSP script-src includes “unsafe-inline”||Missing best practice||Low|
|#260662 - No length limit in invite_code can cause server degradation||Missing best practice||Low|
|#255474 - Profile fields validation bypass||Functional issue||None|
|#265775 - Password reset token issue||Functional issue||None|
|#260468 - first name and last name restrictions bypass||Functional issue||None|
|#257035 - User enumeration from failed login error message||Enumeration||Low|
|#266017 - Logic issue in email change process||Improper access||Low|
|#164648 - Missing access control at password change||Functional issue||None|
|#267356 - Autocomplete feature||Functional issue||None|
|#260299 - observer.com URL should HTTPS||Functional issue||None|
|#260491 - 2FA manual entry uses wrong encoding||Functional issue||None|
|#260591 - Futureoflife organization URL should be HTTPS||Functional issue||None|
|#260316 - Profile fields validation mismatch||Functional issue||None|
|#260938 - Homograph IDNs displayed in Description||Functional issue||None|
|#260941 - UX: JS error on Password Safety link||Functional issue||None|
|#268629 - Failed OutLink on Terms of Service||Functional issue||None|
|#269288 - External links to be in HTTP||Functional issue||None|
|#268981 - Missing homograph filter character||Functional issue||None|
|#260390 - 2FA manual entry uses wrong encoding||Functional issue||None|
|#255481 - app.legalrobot.com opens FireFox but not in FireFox ESR||Functional issue||None|
|#255100 - No error or notification on Reset password page||Functional issue||None|
|#259400 - Issues with Forgot password Error Handling||Functional issue||None|
|#261285 - Privilege Escalation to Admin-level Account||Privilege Escalation||High|
We also publicly disclosed the following reports, which were evaluated as having no security impact.
|#254895 - SSL BREACH attack (CVE-2013-3587)||Informative|
|#255041 - LUCKY13 (CVE-2013-0169) effects legalrobot.com||Informative|
|#216330 - Big XSS vulnerability!||Spam|
|#250766 - Subdomain misconfiguration [mail.legalrobot.com]||Informative|
|#254927 - Lack of input validation in e-mail & user name, job title, company name field||Informative|
|#255020 - Password Reset page Session Fixation||Not Applicable|
|#260239 - Tampering the mail id on chatbox||Informative|
|#260689 - Weak Cryptography for Passwords||Informative|
|#260751 - Change password session fixed||Spam|
|#263196 - Name can’t be numbers or email||Informative|
|#262140 - Password Restriction On Change||Informative|
|#263589 - Email Length Verification||Spam|
|#255025 - Create Api Key is not working||Informative|
|#260838 - Special characters are not filtered out on profile fields||Informative|
|#261817 - Information disclosure||Informative|
|#178990 - The websocket traffic is not secure enough||Informative|
|#166231 - CSRF Issue||Informative|
|#163730 - News Feed Detected||Spam|
|#263846 - Registration Allows Disposable Email Addresses||Informative|
|#213767 - Password Policy Bypass||Informative|
|#264023 - Coding error !||Duplicate|
|#263743 - I cant login to my account||Informative|
|#264101 - design issue exists on login page||Spam|
|#260492 - Invalid Email Verification||Informative|
|#189023 - S3 ACL misconfiguration||Informative|
|#165542 - clickjacking at http://mailboxes.legalrobot-uat.com/||Not Applicable|
|#263681 - Improper error message||Informative|
|#265619 - No alert in verify email address with wrong input||Informative|
|#265441 - Error the message with already e-mail||Informative|
|#265749 - Bypass email verification when register new account||Not Applicable|
|#263728 - Password Complexity||Not Applicable|
We require all members of the Legal Robot community to abide by our Code of Conduct. As of the date of this report, we have not received any reports alleging violations of our code of conduct.
For more information about what inspired this statement, see https://www.canarywatch.org.
As of October 1st, 2017:
Special note should be taken if this transparency report is not updated by the expected date at the top of the page, or if this section is modified or removed from the page.
The canary scheme is not infallible. Although signing the declaration makes it difficult for a third party to produce this declaration, it does not prevent them from using force or other means, like blackmail or compromising the signers’ laptops, to coerce us to produce false declarations.
Legal Robot has not received any “take down” notices or other removal requests under the Digital Millennium Copyright Act (“DMCA”) or any other regulation like Article 12 of Directive 95/46/EC, or the newer Article 17 of the General Data Protection Regulation (“GDPR”), commonly known as the “right to be forgotten”.
The news quotes below show this report could not have been created prior to October 1st, 2017.
-----BEGIN PGP SIGNATURE-----

wsFcBAEBCAAQBQJZ0eFNCRCY0PbwMF7zeAAAGiIQAFG70cz1qRFdpCJzFWam1EJy
NbaiNSXoIkCX1aBskDhH2MAb0fakUb4KvYWIaD1v1yVQ6gbEa9TkuKUHBBJlZ546
bOLBFdvX9MTqHKBM2Wl0jsHGMZQ9ovGl84H8EqrMChzlcERo6J0r1NYcXcbLVtvm
2QnkLejYBs+vza0NTX55DtXF1Bc+DOzyPUepsGx8YlccKKZG8j9zwmi/kE2tSgeu
n3DRSp/wzlayiEIkTf/iWDUFo8uyLrtUXskjGDTMmdP7x29JfxYsBg6OD894m8e1
M+fryCwSeq2WFCV0FRA1Tyv3w18dIkwxix8blHm1JgQiuvR5cDBoXQa/HIq4B0ij
acg9uDlA2wHDDFneIr4jzewirHbhYrbAxE83aS4aDkCdz4H651ySi3OnyCxFrCmI
DC/Fw/U8RAmJp2qGVDX6TWvffIcR7OKfQ1ejbzcy/yuQi8AOiWOpKKFXlak1jxb4
hJeGQ36kWTo4wexaBmquN9BM499kaMxzupzL9hr4XBrLOGmZe2+MMZGx39R2myaR
rmIK16NT1/Wd0aQTUbXodFCJ9DGtzABUgJerBxpzY1Lh1s8tnPVYWttJlrxLa9ks
XhprArhB1+V5wMvyZjbhiYmWL0LDDMgsb9VF1b2JK3PZyNr6mmUGhMuGlJXD5t2L
FqUjak6xMxMNIXFs3DdW
=ef/l
-----END PGP SIGNATURE-----